The audio-transcriber is a command-line tool that transcribes audio files with the Whisper model. It supports several GPU backends for hardware acceleration: Vulkan, CUDA, HIPBLAS, and Metal.
- Whisper Integration: Utilizes the whisper-rs library for accurate transcription.
- Multi-backend Support:
  - Vulkan: Leverages GPU acceleration using the Vulkan API. Suitable for cross-platform applications and modern GPUs.
  - CUDA: Optimized for NVIDIA GPUs with CUDA support. Ideal for high-performance computing on NVIDIA hardware.
  - HIPBLAS: Utilizes AMD GPUs with HIPBLAS for high-performance linear algebra operations. Best suited for AMD GPU users.
  - Metal: Supports Apple's Metal API for optimized performance on macOS systems.
Before installing and running the audio-transcriber, ensure you have the following prerequisites:
- Rust Toolchain:
  - Ensure Rust is installed. You can download it from https://www.rust-lang.org/tools/install.
- FFmpeg:
  - The program checks for FFmpeg's presence and downloads it if missing. You can also install it manually.
  - Manual Installation:
    - Windows: Download FFmpeg and extract the binaries to a directory in your PATH.
    - macOS: Install via Homebrew:
      brew install ffmpeg
    - Linux: Install via your package manager, e.g., on Ubuntu:
      sudo apt-get update && sudo apt-get install ffmpeg
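The FFmpeg presence check described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the helper name `is_on_path` is hypothetical, and it assumes that a successful `-version` invocation is a good enough signal that the binary is usable.

```rust
use std::process::{Command, Stdio};

/// Hypothetical helper: returns true if `program -version` can be spawned
/// and exits with a zero status, i.e. the binary is reachable through PATH.
fn is_on_path(program: &str) -> bool {
    Command::new(program)
        .arg("-version")
        // Discard the version banner; only the exit status matters here.
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        // A spawn error (for example "file not found") means it is missing.
        .unwrap_or(false)
}
```

A caller could run `is_on_path("ffmpeg")` at startup and fall back to the download step, or print the manual installation hints, when it returns false.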
The recommended method to install the audio-transcriber is via Cargo, Rust's package manager.
- Clone the Repository
  git clone https://github.com/woutermans/audio-transcriber.git
  cd audio-transcriber
- Install Dependencies and Compile with Backend Features
  To compile with specific backend support (e.g., Vulkan), enable the corresponding feature flag.
  - For Vulkan
    cargo build --release --features vulkan
  - For CUDA
    cargo build --release --features cuda
  - For HIPBLAS
    cargo build --release --features hipblas
  - For Metal
    cargo build --release --features metal
- Using cargo install
  Alternatively, you can install it globally using Cargo's install command with a specified backend:
  cargo install audio-transcriber --features vulkan
The audio-transcriber supports the following GPU backends for enhanced performance. Enabling a feature enables the corresponding backend during compilation.
- Vulkan
  - Utilizes GPU acceleration using the Vulkan API.
  - Requires system libraries and drivers compatible with Vulkan.
- CUDA
  - Optimized for NVIDIA GPUs.
  - Requires CUDA toolkit installation on your system.
- HIPBLAS
  - Leverages AMD GPUs with HIPBLAS support.
  - Requires ROCm or other compatible GPU drivers.
- Metal
  - Supports Apple's Metal API for optimized performance on macOS systems.
Additional Notes:
Ensure that your system meets the hardware and software requirements for each backend. Refer to the documentation provided by NVIDIA, AMD, or Apple for installation guides specific to each GPU architecture.
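Because backends are selected through Cargo features, the choice is fixed at compile time rather than at run time. The sketch below shows one way a binary could report which backend it was built with. It is an illustration only: the function name `compiled_backend` and the "cpu" fallback label are assumptions, and the feature names are taken from the build commands in this README.

```rust
/// Hypothetical helper: names the GPU backend selected at compile time.
/// `cfg!(feature = "...")` expands to a constant `true` or `false`
/// depending on the `--features` flags passed to Cargo.
fn compiled_backend() -> &'static str {
    if cfg!(feature = "cuda") {
        "cuda"
    } else if cfg!(feature = "vulkan") {
        "vulkan"
    } else if cfg!(feature = "hipblas") {
        "hipblas"
    } else if cfg!(feature = "metal") {
        "metal"
    } else {
        // Assumed label for a build without any GPU feature enabled.
        "cpu"
    }
}
```

Built without any `--features` flag, this function returns "cpu". Switching backends means rebuilding with a different flag.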
- Basic Transcription
  cargo run --release --features vulkan /path/to/audio [model_path]
  Replace /path/to/audio with the path to your audio file and optionally provide a custom model path (default is ggml-large-v3-turbo.bin).
- Handling Multiple Backends
  To switch between backends, enable the desired feature flag during compilation or use environment variables if supported.
- Transcribe an audio file using CUDA:
  cargo run --release --features cuda /path/to/audio
- Transcribe an audio file using Metal on macOS:
  cargo run --release --features metal /path/to/audio
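The command shape used above (a required audio path followed by an optional model path) could be handled along these lines. This is a sketch under assumptions, not the tool's actual argument parser: `parse_args` and `DEFAULT_MODEL` are hypothetical names, and only the default file name comes from this README.

```rust
/// Default model file name, as stated in the usage notes.
const DEFAULT_MODEL: &str = "ggml-large-v3-turbo.bin";

/// Hypothetical parser: splits positional arguments (program name already
/// removed) into `(audio_path, model_path)`, applying the default model
/// when only the audio path is given.
fn parse_args(args: &[String]) -> Result<(String, String), String> {
    match args {
        [audio] => Ok((audio.clone(), DEFAULT_MODEL.to_string())),
        [audio, model] => Ok((audio.clone(), model.clone())),
        // Zero arguments or more than two are rejected with a usage hint.
        _ => Err("usage: audio-transcriber /path/to/audio [model_path]".to_string()),
    }
}
```

In a real binary the input slice would typically come from `std::env::args().skip(1).collect::<Vec<String>>()`.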
The project relies on several crates for functionality:
- hound: For reading WAV files.
- whisper-rs: Integration with the Whisper model.
- reqwest: Handles HTTP requests for downloading FFmpeg.
- tempfile and zip: Manage temporary files and compress/decompress archives.
- indicatif: Displays progress bars during transcription.
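To illustrate how WAV reading and Whisper inference might connect, the sketch below converts interleaved 16-bit PCM samples into mono 32-bit floats, the sample format whisper.cpp-based libraries work with. It uses only the standard library and is not taken from the project: the function name `pcm16_to_mono_f32` is hypothetical, and it assumes the WAV data is 16-bit integer PCM that has already been resampled to the rate the model expects.

```rust
/// Hypothetical helper: converts interleaved 16-bit PCM samples to mono
/// f32 samples in the range [-1.0, 1.0) by scaling each sample and
/// averaging the channels of every frame.
fn pcm16_to_mono_f32(samples: &[i16], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be at least 1");
    samples
        // One chunk per frame; a trailing incomplete frame is dropped.
        .chunks_exact(channels)
        .map(|frame| {
            let sum: f32 = frame.iter().map(|&s| s as f32 / 32768.0).sum();
            sum / channels as f32
        })
        .collect()
}
```

For example, a stereo buffer would be passed with `channels` set to 2, and the result has half as many samples as the input.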
Contributions are welcome! Please follow these steps:
- Fork the Repository
- Create a New Branch
- Make Your Changes
- Run Tests
cargo test
- Submit a Pull Request
Ensure your code adheres to Rust's best practices and the project's coding standards.
This project follows the Contributor Covenant Code of Conduct. By participating, you are expected to uphold this code.
This project is released under the Unlicense - see the LICENSE file for details.