Bias Correction with Pre-trained Audio Embeddings

Implementation of the bias correction methods for pre-trained audio embeddings proposed in the following paper:

Changhong Wang, Gaël Richard, and Brian McFee. "Transfer Learning and Bias Correction with Pre-trained Audio Embeddings". Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2023.

(P.S. deem in deem.py is short for "debiasing embeddings".)

Content

Installation
Datasets
Pre-trained embeddings
Bias correction
Note
Contact
Cite

Installation

We recommend using a Conda environment:

git clone https://github.com/changhongw/audio-embedding-bias.git
cd audio-embedding-bias
conda env create -f environment.yml
conda activate embedding-bias

Datasets

Download the IRMAS and OpenMIC datasets and save them in the directories data/irmas and data/openmic-2018, respectively.
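Before running the notebooks, it can help to confirm that both dataset directories are where the notebooks expect them. The following is a minimal sketch, assuming it is run from the repository root; the helper name missing_dataset_dirs is hypothetical and not part of this repository.

```python
from pathlib import Path

# Directory names are taken from this README; the helper itself is hypothetical.
EXPECTED_DIRS = ("data/irmas", "data/openmic-2018")


def missing_dataset_dirs(repo_root="."):
    """Return the expected dataset directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("Both dataset directories are in place.")
```

The check only verifies that the directories exist; it does not validate their contents.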

Pre-trained embeddings

Extract VGGish, OpenL3, and YAMNet embeddings for both datasets, or use our pre-extracted embeddings directly.

Bias correction

Run the notebooks in the notebooks directory:

0_data_distribution.ipynb: examine each dataset's genre distribution and number of samples per class
1_debias_linear.ipynb: linear bias correction (original, LDA, mLDA)
2_debias_nonlinear.ipynb: nonlinear bias correction (K, KLDA, mKLDA)
3_cosine_similarity.ipynb: calculate cosine similarity between dataset separation and instrument classification; check matrix rank for the case of multiple bias correction
4_result_summary.ipynb: summarize results from all bias correction methods
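To give a feel for what the linear notebooks and the cosine-similarity notebook operate on, here is a rough sketch on synthetic data. It is not the code from the notebooks or from deem.py. It assumes one plausible reading of LDA-based bias correction: find a direction that separates the two datasets, then project it out of the embeddings. The two-class Fisher discriminant is computed in closed form with NumPy, and all function names, the regularization constant, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def lda_direction(X, y, reg=1e-6):
    """Unit-norm two-class Fisher discriminant direction separating y == 0 from y == 1."""
    X0, X1 = X[y == 0], X[y == 1]
    mean_diff = X1.mean(axis=0) - X0.mean(axis=0)
    # Pooled within-class scatter, lightly regularized so the solve is stable.
    Sw = np.cov(X0, rowvar=False) * (len(X0) - 1)
    Sw += np.cov(X1, rowvar=False) * (len(X1) - 1)
    Sw += reg * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mean_diff)
    return w / np.linalg.norm(w)


def project_out(X, w):
    """Remove the component of each row of X along the unit vector w."""
    return X - np.outer(X @ w, w)


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in for embeddings from two datasets (not real IRMAS/OpenMIC data).
rng = np.random.default_rng(0)
n, d = 200, 8
dataset_id = np.repeat([0, 1], n)              # 0 = first dataset, 1 = second dataset
instrument = rng.integers(0, 2, size=2 * n)    # hypothetical binary instrument label
X = rng.normal(size=(2 * n, d))
X[dataset_id == 1, 0] += 3.0                   # dataset shift along the first axis
X[instrument == 1, 1] += 2.0                   # instrument cue along the second axis

w_dataset = lda_direction(X, dataset_id)
w_instrument = lda_direction(X, instrument)

# Debias by removing the dataset-separating direction from every embedding.
X_debiased = project_out(X, w_dataset)

print("cosine(dataset, instrument):", cosine_similarity(w_dataset, w_instrument))
print("max residual along w_dataset:", np.abs(X_debiased @ w_dataset).max())
```

A cosine similarity near zero would suggest that removing the dataset direction costs little instrument information, while a value near plus or minus one would suggest the two are entangled. The kernelized variants in 2_debias_nonlinear.ipynb are not covered by this sketch.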

Note

Thanks to Jayeon Yi, we noticed two typos in the paper, both in the dimensionality of $W$ and $U$ in Equation (3). The corrections are as follows:

$W\in\mathbb{R}^{D\times G}$ -> $W\in\mathbb{R}^{G\times D}$
$U\in\mathbb{R}^{D\times G}$ -> $U\in\mathbb{R}^{G\times G}$

Contact

For any questions, support, or inquiries, please feel free to contact changhong.wang@telecom-paris.fr.

Cite

Please cite the following paper if you use the code provided in this repository.

Changhong Wang, Gaël Richard, and Brian McFee. "Transfer Learning and Bias Correction with Pre-trained Audio Embeddings". Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2023.

@inproceedings{wang2023transfer,
    author = {Changhong Wang and Gaël Richard and Brian McFee},
    title = {Transfer Learning and Bias Correction with Pre-trained Audio Embeddings},
    booktitle = {Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference},
    year = 2023,
}
