Implementation of the bias correction methods for pre-trained audio embeddings proposed in the following paper:
Changhong Wang, Gaël Richard, and Brian McFee. "Transfer Learning and Bias Correction with Pre-trained Audio Embeddings". Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2023.
(P.S. deem in deem.py is an acronym for "debiasing embeddings".)
We recommend using a Conda environment:
git clone https://github.com/changhongw/audio-embedding-bias.git
cd audio-embedding-bias
conda env create -f environment.yml
conda activate embedding-bias
Download the IRMAS and OpenMIC datasets and save them in the directories data/irmas and data/openmic-2018, respectively.
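Before running the notebooks, it can help to confirm that both datasets are where the instructions above expect them. The following is a minimal sketch, not part of the repository: the helper name missing_dataset_dirs is hypothetical, and it only checks that the two directories exist, not that their contents are complete.

```python
from pathlib import Path

# Directory names taken from the download instructions above.
EXPECTED_DIRS = ("data/irmas", "data/openmic-2018")


def missing_dataset_dirs(repo_root="."):
    """Return the expected dataset directories that do not exist under repo_root.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("Both dataset directories are in place.")
```

Run it from the repository root, or pass the path to the clone as repo_root.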
Extract VGGish, OpenL3, and YAMNet embeddings for both datasets, or use our extracted embeddings directly.
Run the notebooks in notebooks:
- 0_data_distribution.ipynb: inspect each dataset's genre distribution and number of samples per class
- 1_debias_linear.ipynb: linear bias correction (original, LDA, mLDA)
- 2_debias_nonlinear.ipynb: nonlinear bias correction (K, KLDA, mKLDA)
- 3_cosine_similarity.ipynb: calculate the cosine similarity between dataset separation and instrument classification; check the matrix rank for the case of multiple bias correction
- 4_result_summary.ipynb: summarize results from all bias correction methods
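To give a feel for what the linear bias correction and the cosine similarity and rank checks involve, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the code in deem.py or the notebooks: the data are synthetic, only two datasets are considered, the function names (lda_direction, project_out, cosine_similarity) are hypothetical, and the "classifier weights" v are a random stand-in rather than a trained instrument classifier. The idea shown is to find the Fisher LDA direction that separates the two datasets and project it out of the embeddings.

```python
import numpy as np


def lda_direction(X, dataset_id):
    """Unit-norm Fisher LDA direction separating two datasets (labels 0 and 1)."""
    X0, X1 = X[dataset_id == 0], X[dataset_id == 1]
    mean_diff = X1.mean(axis=0) - X0.mean(axis=0)
    # Within-class scatter; a small ridge term keeps the solve well-posed.
    Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
          + np.cov(X1, rowvar=False) * (len(X1) - 1))
    Sw += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mean_diff)
    return w / np.linalg.norm(w)


def project_out(X, w):
    """Remove the component of every row of X along the unit vector w."""
    return X - np.outer(X @ w, w)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in: two "datasets" of 8-dimensional embeddings with shifted means.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 8)), rng.normal(1.0, 1.0, (200, 8))])
dataset_id = np.repeat([0, 1], 200)

w = lda_direction(X, dataset_id)   # dataset separation direction
X_debiased = project_out(X, w)     # embeddings with that direction removed

# The debiased embeddings have no component left along w,
# and the data matrix loses one dimension of rank.
residual = np.abs(X_debiased @ w).max()
rank_before = np.linalg.matrix_rank(X)
rank_after = np.linalg.matrix_rank(X_debiased)

# Hypothetical classifier weight vector (random here, for illustration only).
v = rng.normal(size=8)
sim_before = cosine_similarity(w, v)
sim_after = cosine_similarity(w, project_out(v[None, :], w)[0])

print("max residual along w:", residual)
print("rank before/after:", rank_before, rank_after)
print("cosine similarity before/after:", sim_before, sim_after)
```

The kernelized variants and the multi-dataset case handled in the notebooks are not covered by this sketch.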
Thanks to Jayeon Yi, we noticed two typos in the paper, both in the stated dimensionality of a matrix:
- $W\in\mathbb{R}^{D\times G}$ -> $W\in\mathbb{R}^{G\times D}$
- $U\in\mathbb{R}^{D\times G}$ -> $U\in\mathbb{R}^{G\times G}$
For any questions, support, or inquiries, please feel free to contact [email protected].
Please cite the following paper if you use the code provided in this repository.
Changhong Wang, Gaël Richard, and Brian McFee. "Transfer Learning and Bias Correction with Pre-trained Audio Embeddings". Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2023.
@inproceedings{wang2023transfer,
author = {Changhong Wang and Gaël Richard and Brian McFee},
title = {Transfer Learning and Bias Correction with Pre-trained Audio Embeddings},
booktitle = {Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference},
year = 2023,
}