A fictional audio CAPTCHA breaker using machine learning.

monolli/CaptchAI

CaptchAI

This is a speech recognition project built to explore feature engineering on audio samples and Python best practices.

The challenge is to break a fictitious audio CAPTCHA formed by a sequence of four characters. The CAPTCHAs were built from audio samples recorded by volunteer students of the Universidade Federal do ABC. The samples were recorded with a variety of microphones, so expect a variety of background noises. The character sequences were randomly assembled, so you will find nonmatching voices in the same CAPTCHA.

The proposed solution uses the Random Forest algorithm from the Scikit-learn package.
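
As a rough illustration of the approach, the sketch below trains Scikit-learn's RandomForestClassifier on a feature matrix and predicts labels for held-out rows. It is not the project's training code: the synthetic features, the four integer classes, and the hyperparameters (n_estimators=100, random_state=0) are assumptions made only so the example runs without the private audio samples. In the real project each row would hold features extracted from one spoken character.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for extracted audio features (assumption, not project data):
# 200 samples, 20 features, 4 classes whose feature means differ by class index.
rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=200)
features = rng.normal(size=(200, 20)) + labels[:, None]

# Hold out the last 50 rows for evaluation.
train_x, test_x = features[:150], features[150:]
train_y, test_y = labels[:150], labels[150:]

# Hyperparameters here are illustrative defaults, not the project's settings.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(train_x, train_y)

predicted = clf.predict(test_x)
accuracy = float((predicted == test_y).mean())
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```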

The original audio samples are not publicly available in order to preserve the privacy of the volunteers.

Getting Started

Prerequisites

You must have Python 3.7 or greater and Pip installed.

Installing

Install the dependencies using the requirements.txt file.

pip install -r requirements.txt

Data Prep

If you have a folder of ".wav" samples that you would like to use, place them in a "data" folder structured as follows:

./data/training
./data/validation
./data/test

Then run the data prep script:

python data_prep.py
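
Before running the data prep script, it can help to confirm that the folder layout is in place. The helper below is a hypothetical convenience and not part of the repository; it assumes the ".wav" files sit directly inside each of the three folders, which the README does not state explicitly.

```python
from pathlib import Path

# Folder names taken from the layout described above.
EXPECTED_SPLITS = ("training", "validation", "test")


def count_wav_files(data_dir="data"):
    """Return {split: number of .wav files}, or None for a missing folder."""
    counts = {}
    for split in EXPECTED_SPLITS:
        split_dir = Path(data_dir) / split
        if split_dir.is_dir():
            # Count only files directly inside the split folder (assumption).
            counts[split] = sum(
                1 for p in split_dir.iterdir() if p.suffix.lower() == ".wav"
            )
        else:
            counts[split] = None
    return counts


if __name__ == "__main__":
    for split, n in count_wav_files().items():
        status = "missing" if n is None else f"{n} .wav file(s)"
        print(f"data/{split}: {status}")
```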

Training

To train the model, run the following command:

python train_model.py

Predicting

To make predictions on the test dataset, run the following command:

python run_model.py

Graphs

A mel-spectrogram can be generated by running:

python generate_graphs.py

Built With

  • Python - The programming language.
  • Scikit-learn - Used to train the model and make predictions.
  • Pandas - Used to generate DataFrames.
  • Librosa - Used to manipulate the audio files and extract some features.

Authors

  • Lucas Monteiro de Oliveira - Coding - Monolli
  • João Victor Fontinelle Consonni - Report - Cojonni

License

This project is licensed under the GNU GPLv3 license - see the LICENSE.md file for details.
