Gemini AI Audio Transcription Python Module
ي edited this page Jan 28, 2025 · 1 revision
The gemini_audio_text.py module transcribes audio files using Google's Gemini API (the code uses the gemini-1.5-flash model). It includes functions to load environment variables, configure the Google API, and transcribe an audio file.
Description: Loads environment variables from a .env file.
from dotenv import load_dotenv
from loguru import logger

def load_environment():
    load_dotenv()
    logger.info("Environment variables loaded successfully.")
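To illustrate what loading a .env file amounts to, here is a simplified standard-library sketch: it reads KEY=VALUE lines and copies them into os.environ without overwriting variables that are already set. The helper name load_env_file and the variable DEMO_TRANSCRIBE_KEY are hypothetical, and this is not how python-dotenv is implemented; the real load_dotenv() also handles quoting rules, export prefixes, and variable expansion that this sketch ignores.

```python
import os
import tempfile


def load_env_file(path):
    """Hypothetical, simplified stand-in for load_dotenv(): parse KEY=VALUE lines."""
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip("'\"")
            # Like load_dotenv's default, do not override existing variables.
            if key not in os.environ:
                os.environ[key] = value
                loaded[key] = value
    return loaded


# Usage: write a throwaway .env file and load it.
os.environ.pop("DEMO_TRANSCRIBE_KEY", None)  # make the demo repeatable
with tempfile.TemporaryDirectory() as tmp_dir:
    env_path = os.path.join(tmp_dir, ".env")
    with open(env_path, "w", encoding="utf-8") as handle:
        handle.write("# demo only\nDEMO_TRANSCRIBE_KEY='placeholder-value'\n")
    result = load_env_file(env_path)
```

In the module itself, load_dotenv() does this work; the sketch only shows why GEMINI_API_KEY becomes visible to os.getenv after load_environment() runs.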
Description: Configures the Google Gemini API for audio transcription.
Raises: ValueError if the GEMINI_API_KEY environment variable is not set.
import os
import google.generativeai as genai

def configure_google_api():
    # Read the key that load_environment() made available from the .env file.
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        error_message = "Google API key not found. Please set the GEMINI_API_KEY environment variable."
        logger.error(error_message)
        raise ValueError(error_message)
    genai.configure(api_key=api_key)
    logger.info("Google Gemini API configured successfully.")
Description: Transcribes audio using Google's Gemini API (model gemini-1.5-flash).
Args:
- audio_file_path (str): The path to the audio file to be transcribed.
Returns:
- str: The transcribed text from the audio. Returns None if transcription fails.
Raises:
- FileNotFoundError if the audio file is not found.
def transcribe_audio(audio_file_path):
    try:
        load_environment()
        configure_google_api()

        logger.info(f"Attempting to transcribe audio file: {audio_file_path}")

        # Fail early if the input file is missing.
        if not os.path.exists(audio_file_path):
            error_message = f"FileNotFoundError: The audio file at {audio_file_path} does not exist."
            logger.error(error_message)
            raise FileNotFoundError(error_message)

        model = genai.GenerativeModel(model_name="gemini-1.5-flash")

        # Upload the audio so the model can reference it in the prompt.
        try:
            audio_file = genai.upload_file(audio_file_path)
            logger.info(f"Audio file uploaded successfully: {audio_file=}")
        except FileNotFoundError:
            error_message = f"FileNotFoundError: The audio file at {audio_file_path} does not exist."
            logger.error(error_message)
            raise FileNotFoundError(error_message)
        except Exception as e:
            logger.error(f"Error uploading audio file: {e}")
            return None

        # Ask the model for a transcript of the uploaded file.
        try:
            response = model.generate_content([
                "Transcribe the following audio:",
                audio_file
            ])
            if response and hasattr(response, 'text'):
                transcript = response.text
                logger.info(f"Transcription successful:\n{transcript}")
                return transcript
            else:
                logger.warning("Transcription failed: Invalid or empty response from API.")
                return None
        except Exception as e:
            logger.error(f"Error during transcription: {e}")
            return None

    except FileNotFoundError:
        # Re-raise so the documented FileNotFoundError reaches the caller
        # instead of being swallowed by the generic handler below.
        raise
    except Exception as e:
        logger.error(f"An unexpected error occurred: {e}")
        return None
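Because transcribe_audio reports most failures by returning None rather than raising, callers need to check the result explicitly. Below is a minimal sketch of a batch wrapper. The helper transcribe_many and the stub fake_transcribe are hypothetical names introduced for illustration; the transcription function is passed in as an argument so the example runs without network access or an API key.

```python
def transcribe_many(paths, transcribe):
    """Hypothetical helper: run `transcribe` on each path and split the outcomes."""
    transcripts = {}
    failures = []
    for path in paths:
        try:
            text = transcribe(path)
        except FileNotFoundError:
            # Missing input file: record it and keep going.
            failures.append(path)
            continue
        if text is None:
            # A None result signals an upload or API failure.
            failures.append(path)
        else:
            transcripts[path] = text
    return transcripts, failures


def fake_transcribe(path):
    """Stub standing in for transcribe_audio; makes no API calls."""
    if path == "missing.wav":
        raise FileNotFoundError(path)
    if path == "broken.wav":
        return None
    return f"transcript of {path}"


transcripts, failures = transcribe_many(
    ["ok.wav", "broken.wav", "missing.wav"], fake_transcribe
)
```

In real use, transcribe_audio would be passed in place of fake_transcribe.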
- Ensure you have a .env file with the following environment variables:
  - GEMINI_API_KEY: Your Google API key.
- Call the transcribe_audio function with the path to your audio file:
  transcript = transcribe_audio("path/to/your/audio/file.wav")
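Since the function may return None, a caller will usually want to check the result before using it. The sketch below assumes a hypothetical helper, save_transcript, that writes the text next to the audio file with a .txt suffix. The naming scheme is an illustration only, not something the module does, and a placeholder string stands in for a real API result.

```python
from pathlib import Path
import tempfile


def save_transcript(audio_path, transcript):
    """Hypothetical helper: write the transcript beside the audio file as .txt."""
    if transcript is None:
        # transcribe_audio returns None when transcription fails.
        return None
    out_path = Path(audio_path).with_suffix(".txt")
    out_path.write_text(transcript, encoding="utf-8")
    return out_path


# Usage with a placeholder string instead of a real transcription.
with tempfile.TemporaryDirectory() as tmp_dir:
    audio_path = Path(tmp_dir) / "file.wav"
    saved = save_transcript(audio_path, "hello world")
    saved_name = saved.name
    saved_text = saved.read_text(encoding="utf-8")
    skipped = save_transcript(audio_path, None)
```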
Dependencies:
- os
- sys
- google.generativeai
- dotenv
- loguru
The module uses the loguru library for logging to the console with colorized and formatted messages.