Real-Time AI-Powered Transcription and Translation Tool
This project is a real-time tool for transcription and translation. It combines fast speech-to-text transcription with multi-language translation powered by NLLB-200.
It pairs a Rust-based backend for performance with a lightweight Next.js frontend for cross-platform use, and it is designed to run efficiently on local machines.
- Real-Time Transcription: Powered by Whisper Turbo (whisper-rs) for fast speech-to-text processing.
- Efficient Audio Handling: Processes audio in real time using WebGPU and FFmpeg.
- Accurate Translation: Uses the NLLB-200 model from Hugging Face for multilingual translation across 200+ languages.
- Cross-Platform Frontend: Built with Next.js to deliver a responsive, user-friendly interface.
- State Management: Managed via Zustand for a streamlined, reactive application state.
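Real-time transcription usually means feeding the speech model short, overlapping windows of audio rather than one long recording. The sketch below shows one way to slice a PCM buffer into such windows. It is illustrative only: the function name `chunkAudio`, the 5-second window, and the 1-second overlap are assumptions, not values taken from this project. The 16 kHz sample rate reflects the mono input rate that Whisper models expect.

```typescript
// Whisper models expect 16 kHz mono audio.
const SAMPLE_RATE = 16000;

// Split PCM samples into overlapping windows (hypothetical helper).
// windowSec and overlapSec are illustrative defaults, not project settings.
function chunkAudio(
  samples: Float32Array,
  windowSec: number = 5,
  overlapSec: number = 1,
  sampleRate: number = SAMPLE_RATE
): Float32Array[] {
  const windowLen = Math.round(windowSec * sampleRate);
  const hop = windowLen - Math.round(overlapSec * sampleRate);
  if (windowLen <= 0 || hop <= 0) {
    throw new RangeError("window must be positive and longer than the overlap");
  }

  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += hop) {
    const end = Math.min(start + windowLen, samples.length);
    // subarray returns a view on the same buffer, so no samples are copied.
    chunks.push(samples.subarray(start, end));
    // Stop once a window has reached the end of the input.
    if (end === samples.length) break;
  }
  return chunks;
}

// Usage: 12 seconds of silence split into 5-second windows with 1 second of overlap.
const demoChunks = chunkAudio(new Float32Array(12 * SAMPLE_RATE));
console.log(demoChunks.map((c) => c.length / SAMPLE_RATE));
```

The overlap gives the model some shared context at each boundary, which can reduce words being cut in half between windows. The right window size is a trade-off between latency and accuracy and would need tuning.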
Layer | Technology | Purpose |
---|---|---|
Transcription | Whisper Turbo | Fast, accurate speech-to-text transcription |
Audio Handling | WebGPU + FFmpeg | Real-time audio processing |
Translation | NLLB-200 | Advanced multilingual translation |
Frontend | Next.js | Lightweight, cross-platform UI |
State Management | Zustand | Simplified and reactive state management |
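To illustrate what "reactive state" means for the layers above, here is a small hand-rolled store with the same `getState` / `setState` / `subscribe` shape that Zustand's vanilla store exposes. It is a dependency-free stand-in for explanation, not Zustand itself and not this project's code. The `TranscriptState` fields and the `createMiniStore` name are hypothetical.

```typescript
// Hypothetical state shape; field names are assumptions for illustration.
interface TranscriptState {
  transcript: string;
  translation: string;
  targetLang: string;
}

type Listener<T> = (state: T, prev: T) => void;

// Minimal reactive store (a stand-in, not Zustand's implementation).
function createMiniStore<T extends object>(initial: T) {
  let state: T = initial;
  const listeners = new Set<Listener<T>>();

  return {
    getState: (): T => state,
    // Shallow-merge the partial update, then notify every subscriber.
    setState: (partial: Partial<T>): void => {
      const prev = state;
      state = { ...prev, ...partial };
      listeners.forEach((listener) => listener(state, prev));
    },
    // Register a listener; the returned function unsubscribes it.
    subscribe: (listener: Listener<T>): (() => void) => {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// Usage: the UI layer subscribes, the transcription layer pushes updates.
const store = createMiniStore<TranscriptState>({
  transcript: "",
  translation: "",
  targetLang: "fra_Latn",
});
const unsubscribe = store.subscribe((s) => console.log("transcript:", s.transcript));
store.setState({ transcript: "hello world" });
unsubscribe();
```

In the real frontend, a Zustand store would play this role, with React components re-rendering whenever the transcript or translation changes.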
- NLLB-200: Hugging Face Documentation
- Whisper Turbo: GitHub | Discussion
- Audio Processing with FFmpeg: Docs
- Realtime Whisper with WebGPU: Example
- Client-Side Translator Tutorial: Codemotion
git clone https://github.com/itsyuimorii/chirimiri_lingo_AI_lively_tranlate_tool.git
cd chirimiri_lingo_AI_lively_tranlate_tool
npm install
npm run dev
npm run build
npm start
- Expand support for additional input formats and languages.
- Integrate with streaming platforms for live transcription and translation.
- Optimize further for on-device performance to enable offline functionality.
A real-time AI-powered transcription and translation tool. It processes live audio and translates it into multiple languages while keeping data private through local, client-side processing. Ideal for meetings, webinars, and live events.
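NLLB-200 identifies languages with FLORES-200 codes such as `eng_Latn` or `jpn_Jpan` rather than two-letter ISO codes, so a translation tool typically needs a mapping step before calling the model. The sketch below shows one possible mapping. The `toFloresCode` and `buildRequest` helpers and the `TranslationRequest` shape are hypothetical, and only a handful of languages are listed.

```typescript
// A few ISO 639-1 codes mapped to the FLORES-200 codes that NLLB-200 uses.
// The selection is illustrative; the model covers far more languages.
const FLORES_CODES = new Map<string, string>([
  ["en", "eng_Latn"],
  ["fr", "fra_Latn"],
  ["es", "spa_Latn"],
  ["de", "deu_Latn"],
  ["ja", "jpn_Jpan"],
  ["ko", "kor_Hang"],
  ["zh", "zho_Hans"],
]);

// Hypothetical request shape; field names are assumptions, not the project's API.
interface TranslationRequest {
  text: string;
  srcLang: string;
  tgtLang: string;
}

// Convert a two-letter code to its FLORES-200 code, or fail loudly.
function toFloresCode(iso: string): string {
  const code = FLORES_CODES.get(iso.toLowerCase());
  if (code === undefined) {
    throw new Error(`Unsupported language code: ${iso}`);
  }
  return code;
}

// Build the payload a translation backend could consume.
function buildRequest(text: string, from: string, to: string): TranslationRequest {
  return { text, srcLang: toFloresCode(from), tgtLang: toFloresCode(to) };
}

// Usage: English to Japanese.
console.log(buildRequest("Good morning", "en", "ja"));
```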