SpeakSieve

SpeakSieve is a web service that uses deep learning for audio tasks such as speaker separation, transcription, and voice censorship. It transcribes audio files and provides dialogue filtered by speaker. The frontend is built with React and the backend with FastAPI.

Getting Started

Prerequisites

  • Python 3.7 or higher
  • Node.js 12.0 or higher

Installation

  1. Clone the repository

     git clone https://github.com/utkar22/CSE508_Winter2023_Group2_Project.git

  2. Create a virtual environment and activate it

     cd CSE508_Winter2023_Group2_Project/
     python -m venv env
     source env/bin/activate   # for Linux/Mac
     env\Scripts\activate      # for Windows

  3. Install the required Python packages

     pip install -r requirements.txt

  4. Install the required Node packages

     cd frontend
     npm install

Running the project

  1. Start the backend server

     cd backend
     python main.py

  2. In a separate terminal, launch the React app

     cd frontend
     npm start

  3. Open your browser and navigate to http://localhost:3000/ to access the SpeakSieve app.
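Once both servers are running, the frontend presumably sends the uploaded audio plus the user's choices (model size, language, number of speakers; see Usage below) to the FastAPI backend. A minimal sketch of assembling those options as a request payload; the field names here are illustrative assumptions, not the backend's actual API:

```python
def build_transcription_options(model_size="base", language="Any", num_speakers=1):
    """Assemble the options the frontend collects before submitting audio.

    Field names are hypothetical; the real backend in main.py may
    expect different keys or send them as form fields.
    """
    return {
        "model_size": model_size,     # e.g. "base" (the default)
        "language": language,         # "English" or "Any"
        "num_speakers": num_speakers, # defaults to 1
    }

print(build_transcription_options(num_speakers=2))
# {'model_size': 'base', 'language': 'Any', 'num_speakers': 2}
```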

Usage

  1. Upload an audio file in a supported format (mp3).
  2. Choose a model size from the dropdown; the default is base.
  3. Choose the language of the audio (English/Any).
  4. Enter the number of speakers (default: 1).
  5. Wait for the transcription to finish. (This step may take time depending on the duration of the audio and the chosen model size.)
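The speaker-filtering step can be sketched in plain Python. This is a minimal illustration, assuming a word- or line-level transcript table like the backend's transcript.csv with hypothetical speaker and text columns; the actual schema produced by the backend may differ:

```python
import csv
import io

def dialogues_for_speaker(transcript_csv, speaker):
    """Return all dialogue lines attributed to the given speaker.

    Assumes a CSV with 'speaker' and 'text' columns; the real
    transcript.csv generated by the backend may use other headers.
    """
    reader = csv.DictReader(io.StringIO(transcript_csv))
    return [row["text"] for row in reader if row["speaker"] == speaker]

# Tiny in-memory example transcript
sample = (
    "speaker,text\n"
    "SPEAKER_00,hello there\n"
    "SPEAKER_01,hi how are you\n"
    "SPEAKER_00,doing well thanks\n"
)

print(dialogues_for_speaker(sample, "SPEAKER_00"))
# ['hello there', 'doing well thanks']
```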

Project Structure

CSE508_Winter2023_Group2_Project
├─ .git
├─ .gitignore
├─ backend
│  ├─ audio.wav
│  ├─ audio_files
│  ├─ extract_phrases.py
│  ├─ get_all_dialogues.py
│  ├─ main.py
│  ├─ speaker_tags_generator.py
│  ├─ transcript-word.csv
│  ├─ transcript-word_bleeped.csv
│  ├─ transcript.csv
│  ├─ transcript.txt
│  └─ voice_censoring_api.py
├─ Censoring
│  ├─ VOSK.ipynb
│  └─ vosk.py
├─ Extract_Phrase
│  └─ extract_phrases.py
├─ frontend
│  ├─ .gitignore
│  ├─ package-lock.json
│  ├─ package.json
│  ├─ public
│  │  ├─ favicon.ico
│  │  ├─ index.html
│  │  ├─ logo192.png
│  │  ├─ logo512.png
│  │  ├─ manifest.json
│  │  └─ robots.txt
│  ├─ README.md
│  └─ src
│     ├─ App.css
│     ├─ App.js
│     ├─ App.test.js
│     ├─ components
│     │  ├─ ConfirmedPage.css
│     │  ├─ ConfirmedPage.jsx
│     │  ├─ CustomNavbar.jsx
│     │  ├─ Home.css
│     │  ├─ Home.jsx
│     │  ├─ sample.mp3
│     │  └─ TranscriptionPage.jsx
│     ├─ index.css
│     ├─ index.js
│     ├─ logo.svg
│     ├─ reportWebVitals.js
│     └─ setupTests.js
├─ model-final
│  ├─ environment.yml
│  ├─ hailhydra1.mp3
│  ├─ speaker-separate.py
│  ├─ speakerTags.py
│  └─ speakerTags2.py
├─ model-testing
│  ├─ audio.wav
│  ├─ Baseline Results.ipynb
│  ├─ female-female-mixture.wav
│  ├─ female-female-mixture_est1.wav
│  ├─ female-female-mixture_est2.wav
│  ├─ female-male-mixture.wav
│  ├─ mono_audio.wav
│  ├─ mono_audio_est1.wav
│  ├─ mono_audio_est2.wav
│  ├─ single-source-transcribe.wav
│  ├─ transcript.txt
│  ├─ transcript2.txt
│  └─ transcripts_with_speaker_names.ipynb
├─ README.md
└─ requirements.txt
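The backend's voice_censoring_api.py and the transcript-word_bleeped.csv artifact suggest word-level censoring over the transcript. A minimal sketch of that idea, using a hypothetical banned-word list; the project's actual filtering logic may differ:

```python
BANNED = {"darn", "heck"}  # hypothetical word list, for illustration only

def bleep_words(text, banned=BANNED):
    """Replace banned words with [bleep], preserving word order.

    Case-insensitive match on whole words; punctuation handling is
    deliberately omitted to keep the sketch short.
    """
    return " ".join(
        "[bleep]" if word.lower() in banned else word
        for word in text.split()
    )

print(bleep_words("well darn that was close"))
# well [bleep] that was close
```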
