Gemini AI Audio Transcription Python Module

Overview

The `gemini_audio_text.py` module transcribes audio files with Google's Gemini API (the code below uses the `gemini-1.5-flash` model). It loads environment variables, configures the Google API client, and uploads the audio file for transcription.

Functions

1. `load_environment()`

Description: Loads environment variables from a .env file.

```python
from dotenv import load_dotenv
from loguru import logger

def load_environment():
    load_dotenv()
    logger.info("Environment variables loaded successfully.")
```
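
As a rough illustration of what this step accomplishes, the sketch below mimics the default behavior of `load_dotenv()` using only the standard library: it reads `KEY=VALUE` lines and sets variables that are not already defined. The helper name `apply_env_lines` and its simplified parsing are assumptions made for illustration. python-dotenv itself also handles quoting, interpolation, and file discovery, which this sketch does not.

```python
import os


def apply_env_lines(text, environ=None):
    """Hypothetical, simplified stand-in for loading a .env file."""
    environ = os.environ if environ is None else environ
    loaded = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        # Existing variables win, mirroring load_dotenv's default (override=False).
        if key and key not in environ:
            environ[key] = value
            loaded[key] = value
    return loaded


if __name__ == "__main__":
    sample = "# local settings\nGEMINI_API_KEY=your-api-key-here\n"
    print(apply_env_lines(sample, environ={}))
```

Passing a plain dictionary as `environ` keeps the example from touching the real process environment.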

2. `configure_google_api()`

Description: Configures the Google Gemini API for audio transcription.

Raises: ValueError if the GEMINI_API_KEY environment variable is not set.

```python
import os

import google.generativeai as genai
from loguru import logger


def configure_google_api():
    # Read the key loaded from the .env file (or the surrounding environment).
    api_key = os.getenv("GEMINI_API_KEY")
    if not api_key:
        error_message = "Google API key not found. Please set the GEMINI_API_KEY environment variable."
        logger.error(error_message)
        raise ValueError(error_message)

    genai.configure(api_key=api_key)
    logger.info("Google Gemini API configured successfully.")
```
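
The check above follows a fail-fast pattern: a missing key is reported before any API call is attempted. The sketch below isolates that pattern so it can be run without the Gemini SDK. `require_env` is a hypothetical helper introduced here for illustration and is not part of the module.

```python
import os


def require_env(name):
    """Return an environment variable's value or raise ValueError (hypothetical helper)."""
    value = os.getenv(name)
    if not value:
        # Empty strings count as missing, like the `if not api_key` check.
        raise ValueError(f"{name} is not set. Please set the {name} environment variable.")
    return value


if __name__ == "__main__":
    try:
        require_env("GEMINI_API_KEY")
        print("API key found; the client can be configured.")
    except ValueError as exc:
        print(f"Configuration error: {exc}")
```

Catching the ValueError at the call site lets a script print a setup hint instead of a traceback.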

3. `transcribe_audio(audio_file_path)`

Description: Transcribes an audio file using a Google Gemini model (`gemini-1.5-flash` in the code below).

Args:
- audio_file_path (str): The path to the audio file to be transcribed.

Returns:
- str: The transcribed text from the audio, or None if transcription fails.

Raises:
- FileNotFoundError: If the audio file is not found.

```python
def transcribe_audio(audio_file_path):
    try:
        load_environment()
        configure_google_api()

        logger.info(f"Attempting to transcribe audio file: {audio_file_path}")

        # Fail early with a clear error if the path does not exist.
        if not os.path.exists(audio_file_path):
            error_message = f"FileNotFoundError: The audio file at {audio_file_path} does not exist."
            logger.error(error_message)
            raise FileNotFoundError(error_message)

        model = genai.GenerativeModel(model_name="gemini-1.5-flash")

        # Upload the audio so the model can reference it in the prompt.
        try:
            audio_file = genai.upload_file(audio_file_path)
            logger.info(f"Audio file uploaded successfully: {audio_file=}")
        except FileNotFoundError:
            # The file disappeared between the existence check and the upload.
            error_message = f"FileNotFoundError: The audio file at {audio_file_path} does not exist."
            logger.error(error_message)
            raise FileNotFoundError(error_message)
        except Exception as e:
            logger.error(f"Error uploading audio file: {e}")
            return None

        # Ask the model for a transcript of the uploaded file.
        try:
            response = model.generate_content([
                "Transcribe the following audio:",
                audio_file
            ])

            if response and hasattr(response, 'text'):
                transcript = response.text
                logger.info(f"Transcription successful:\n{transcript}")
                return transcript
            else:
                logger.warning("Transcription failed: Invalid or empty response from API.")
                return None

        except Exception as e:
            logger.error(f"Error during transcription: {e}")
            return None

    except FileNotFoundError:
        # Propagate to the caller, as documented under "Raises".
        raise
    except Exception as e:
        # Any other failure (including a missing API key) is logged and reported as None.
        logger.error(f"An unexpected error occurred: {e}")
        return None
```
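
The response check in the function can be isolated into a small helper. The sketch below is one possible shape for it, tested against stub objects rather than the real SDK. `extract_text` is a hypothetical name, and the `ValueError` branch rests on the assumption that some SDK responses raise when `.text` is accessed without a usable text part.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return the response text, or None if it is missing, empty, or inaccessible."""
    if response is None:
        return None
    try:
        text = getattr(response, "text", None)
    except ValueError:
        # Assumption: accessing `.text` may raise when the response has no text part.
        return None
    return text or None


if __name__ == "__main__":
    # Stub objects stand in for API responses so the example runs offline.
    print(extract_text(SimpleNamespace(text="hello world")))
    print(extract_text(SimpleNamespace()))
```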

Usage

Ensure you have a .env file with the following environment variable:

- GEMINI_API_KEY: Your Google API key.

Call the transcribe_audio function with the path to your audio file:
```
transcript = transcribe_audio("path/to/your/audio/file.wav")
```
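
Because `transcribe_audio` is documented to return None on failure and to raise FileNotFoundError for a missing file, callers may want to handle both outcomes. The sketch below shows one way to do that. `transcribe_or_default` is a hypothetical wrapper, and the transcription function is passed in as a parameter so the example runs without the Gemini SDK.

```python
def transcribe_or_default(transcribe, audio_file_path, default=""):
    """Call a transcribe_audio-style function and fall back to `default` on failure."""
    try:
        transcript = transcribe(audio_file_path)
    except FileNotFoundError:
        # Documented failure mode for a missing audio file.
        return default
    # None signals that the upload or the transcription request failed.
    return default if transcript is None else transcript


if __name__ == "__main__":
    # Stand-in for transcribe_audio so the example runs offline.
    def fake_transcribe(path):
        return None

    print(repr(transcribe_or_default(fake_transcribe, "path/to/your/audio/file.wav")))
```

In real use, the module's `transcribe_audio` would be passed as the first argument.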

Dependencies

- os
- sys
- google.generativeai (installed as the google-generativeai package)
- dotenv (installed as the python-dotenv package)
- loguru

Logging

The module uses the loguru library for logging to the console with colorized and formatted messages.
