A simple GUI application that uses OpenAI's Whisper API to transcribe audio files into text.
- Transcribe audio files in formats like MP3, WAV, M4A, and FLAC.
- Automatically compresses files larger than 25 MB.
- Displays transcription results within the application.
- Copy transcription text to the clipboard.
- Python 3.11.10
- tk
- requests
- ffmpeg (for audio compression)
- yt-dlp (for video downloading)
- Clone the repository:
    git clone https://github.com/awerks/whisper_gui.git
- Navigate to the project directory:
    cd whisper_gui
- Create and activate the Conda environment:
    conda env create -n transcription -f environment.yml
    conda activate transcription
- Install ffmpeg:
  - On macOS using Homebrew:
      brew install ffmpeg
  - On Windows, download it from the FFmpeg website.
- Install yt-dlp:
  - On macOS using Homebrew:
      brew install yt-dlp
  - On Windows, download it from the yt-dlp releases page.
- Set up the OpenAI API key. Obtain your OpenAI API key and set it as an environment variable:
  - On Linux/macOS:
      export OPENAI_TRANSCRIPTION_KEY='your_api_key_here'
  - On Windows (Command Prompt), without quotes, because cmd stores quotes as part of the value:
      set OPENAI_TRANSCRIPTION_KEY=your_api_key_here
- Run the application:
    python3 program.py
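
If the application fails at startup, it can help to confirm that the key is visible to Python. The following is a minimal sketch, assuming the key is read from the `OPENAI_TRANSCRIPTION_KEY` environment variable described above; `load_api_key` is a hypothetical helper for illustration and is not taken from `program.py`.

```python
import os


def load_api_key(env=None, name="OPENAI_TRANSCRIPTION_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    # Strip whitespace and stray quotes, which can end up in the value
    # when the variable is set with quotes in the Windows Command Prompt.
    key = env.get(name, "").strip().strip("'\"")
    if not key:
        raise RuntimeError(f"{name} is not set; set it before running the app")
    return key
```

Calling `load_api_key()` with no arguments checks the current shell environment.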
- A GUI window will open where you can select an audio file and transcribe it.
- Transcription results will appear in the text area of the application.
- Use the "Copy to Clipboard" button to copy the transcription.
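
For readers curious about what happens behind the transcribe button, the request can be sketched roughly as below. This is an illustration under assumptions, not the code in `program.py`: it posts the file to OpenAI's `/v1/audio/transcriptions` endpoint with the `whisper-1` model using `requests`, and the `transcribe` function, its `post` parameter, and the 300-second timeout are hypothetical choices.

```python
import os

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def transcribe(path, api_key, post=None):
    """Upload one audio file and return the transcribed text.

    `post` defaults to requests.post; a stub can be passed for offline testing.
    """
    if post is None:
        import requests  # third-party dependency, imported only when needed

        post = requests.post
    with open(path, "rb") as audio:
        response = post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            # Multipart upload: the file part plus the model name as a form field.
            files={"file": (os.path.basename(path), audio)},
            data={"model": "whisper-1"},
            timeout=300,
        )
    response.raise_for_status()  # surface HTTP errors such as 401 or 413
    return response.json()["text"]
```

Keeping the network call behind a small function like this also makes it straightforward to run it in a background thread so the GUI stays responsive.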
- Ensure ffmpeg is installed and added to your system's PATH.
- The application supports audio files up to 25 MB without compression.
- Larger files will be compressed automatically before transcription.
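
The size check and compression step described in these notes might look something like the sketch below. It is an assumption-laden illustration rather than the application's actual logic: the 25 MB limit is interpreted as 25 × 1024 × 1024 bytes, the ffmpeg settings (mono, 16 kHz, 32 kbit/s) are one plausible choice, and the helper names are hypothetical.

```python
import os
import shutil
import subprocess

MAX_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" read as 25 MiB


def needs_compression(path, limit=MAX_BYTES):
    """Return True when the file is larger than the upload limit."""
    return os.path.getsize(path) > limit


def ffmpeg_command(src, dst, bitrate="32k"):
    """Build an ffmpeg command that re-encodes to mono, 16 kHz, low-bitrate audio."""
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", "-b:a", bitrate, dst]


def compress(src, dst):
    """Run ffmpeg on `src`, writing the smaller file to `dst`."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(ffmpeg_command(src, dst), check=True)
    return dst
```

A caller would typically check `needs_compression(path)` first and only then call `compress(path, "compressed.mp3")`; very long recordings may still exceed the limit after re-encoding, so checking the output size again is a sensible precaution.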
Distributed under the MIT License.
- Powered by OpenAI's Whisper API.
- FFmpeg.
- yt-dlp