Phone Classification using Wav2Vec2

This repository contains SpeechBrain recipes for fine-tuning Wav2Vec2 models on a phone classification task. The following factors were analysed:

Fine-tuning Wav2Vec2,
Pre-training datasets,
Model size,
Fine-tuning datasets.

Results of this work have been published at the Interspeech 2024 conference.

Code

The recipes/ folder contains all SpeechBrain recipes.
The results are available in the confusion-matrix/ folder.

Data

For confidentiality reasons, the datasets are not included in this repository. This work relies on the C2SI, CommonPhone and BREF corpora.

Recipes

Some of the SpeechBrain recipes set up in this repository are described below.

unfrozen-cp-3k-large-accents is the best recipe published in the Interspeech paper cited below.
unfrozen-cp-3k-large-accents-argmax takes the maximum over all 6 segments (1024-dim) and uses LeakyReLU.
unfrozen-cp-3k-large-concatenate takes both central segments (2048-dim) as input to the classifier.
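
The two input variants above differ only in how the 6 segments are reduced before the classifier. The following is a minimal NumPy sketch of that difference, not the code used in the recipes. It assumes each phone is represented by 6 feature vectors of 1024 dimensions, that "maximum" means an element-wise maximum across segments, and that the "central segments" are the third and fourth of the six. The function names are hypothetical.

```python
import numpy as np


def max_pool_segments(segments: np.ndarray) -> np.ndarray:
    """Element-wise maximum over the segment axis: (n_segments, dim) -> (dim,)."""
    return segments.max(axis=0)


def concat_central_segments(segments: np.ndarray) -> np.ndarray:
    """Concatenate the two middle segments: (n_segments, dim) -> (2 * dim,)."""
    mid = segments.shape[0] // 2
    return np.concatenate([segments[mid - 1], segments[mid]])


# Random stand-in for the 6 x 1024 features of one phone (assumed shape).
rng = np.random.default_rng(0)
segments = rng.standard_normal((6, 1024))

pooled = max_pool_segments(segments)         # 1024-dim classifier input
stacked = concat_central_segments(segments)  # 2048-dim classifier input
print(pooled.shape, stacked.shape)
```

Max pooling keeps the classifier input at 1024 dimensions but discards segment order, while concatenation doubles the input size and preserves the order of the two central segments.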

How to cite

If you use this work, please cite as:

@inproceedings{maisonneuve24,
  title     = {Towards objective and interpretable speech disorder assessment: a comparative analysis of CNN and transformer-based models},
  author    = {Malo Maisonneuve and Corinne Fredouille and Muriel Lalain and Alain Ghio and Virginie Woisard},
  year      = {2024},
  booktitle = {Interspeech 2024},
  pages     = {1970--1974},
  doi       = {10.21437/Interspeech.2024-267},
  issn      = {2958-1796},
}
