Skip to content

Scripts that use Deepgram's (Whisper) json to get YouTube-friendly subtitle format and use Chat GPT API to improve the auto-generated text

Notifications You must be signed in to change notification settings

Tanya301/podcast-subtitle-enhancer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Podcast subtitle enhancer

This repository contains the code used for subtitle generation for the PostgresFM podcast.

  • Here, the process is shown for the 60th episode of the podcast.

1. Generating inital transcript from audio file

To auto-generate the initial transcript, I used Deepgram (Whisper). The json file was also formatted for easier reading.

export YOUR_SECRET=INSERT_DEEPGRAM_KEY

curl -X POST \
  -H "Authorization: Token $YOUR_SECRET" \
  -H 'content-type: application/json' \
  -d '{"url":"https://https://rupostgres.org/060%20Decoupled%20storage%20and%20compute.m4a"}' \
  "https://api.deepgram.com/v1/listen?model=whisper-large&punctuate=true&smart_format=true&diarize=true" > subs.json

bash

cat result.json | jq -r '.' > formatted.json

2. Forming subtitles from the auto-generated transcript

formatted.json gives us separate words with their timecodes. The goal here was to split them into phrases based on the following factors:

  • who is speaking at the moment
  • character count
  • punctuation Besides that, the subtitles are returned in the YouTube-friendly format (.srt).

The algorithm can be found in whisper2srt.py.

##3 Enhancing subtitles using Chat GPT API After running the script from the previous step, we get subtitles that are better than the usual auto-generated subtitles but are still far from perfect. The next goal is to use Chat GPT to fix misspellings and incorrectly recognized words. This is the prompt that was used (prompt.txt):

Below are subtitles, auto-generated by OpenAI Whisper, for a podcast PostgresFM episode

- This is auto-generated (voice2text), help me improve it
- DO NOT CHANGE WORDS! ONLY FIX TYPOS AND TERMS!
- Do NOT add anything, subtitles need to stay true to the video
- Do not change meanings
- Hunt for typos and incorrectly recognized terms, but do not correct grammar mistakes
- Use the glossary to fix incorrectly recognized words
- Leave it in the subtitle format

Glossary:

PostgresFM
Postgres, PostgreSQL
...

Following is the long list of words that are context-specific and are most likely to be incorrectly recognized.

The algorithm can be found in subs_gpt_no_key.py.


Of course, the subtitles received from this process are not perfect and require manual correction. However, they made the process of generating accurate subitles for the podcast much easier and faster.

About

Scripts that use Deepgram's (Whisper) json to get YouTube-friendly subtitle format and use Chat GPT API to improve the auto-generated text

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages