Bias Correction with Pre-trained Audio Embeddings

Implementation of the bias correction methods for pre-trained audio embeddings proposed in the following paper:

Changhong Wang, Gaël Richard, and Brian McFee. "Transfer Learning and Bias Correction with Pre-trained Audio Embeddings". Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2023.

(P.S. "deem" in deem.py is short for "debiasing embeddings".)

Installation

We recommend using a Conda environment:

git clone https://github.com/changhongw/audio-embedding-bias.git
cd audio-embedding-bias
conda env create -f environment.yml
conda activate embedding-bias

Datasets

Download the IRMAS and OpenMIC datasets and save them in the directories data/irmas and data/openmic-2018, respectively.
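Before running the notebooks, it can help to confirm that both dataset directories are where the code expects them. The following is a minimal sketch; the helper name `missing_dataset_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Directory names taken from the instructions above.
EXPECTED_DIRS = ("data/irmas", "data/openmic-2018")


def missing_dataset_dirs(root="."):
    """Return the expected dataset directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("Both dataset directories found.")
```

Run it from the repository root; an empty result means both directories are in place.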

Pre-trained embeddings

Extract VGGish, OpenL3, and YAMNet embeddings for both datasets, or use our pre-extracted embeddings directly.
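As an illustration of the extraction step, the sketch below computes a clip-level OpenL3 embedding for one audio file. It assumes the `openl3` and `soundfile` packages are installed; the `content_type`, `embedding_size`, and the mean pooling over time are assumptions chosen for the example and are not necessarily the settings used in the paper. The function names are hypothetical.

```python
import numpy as np


def pool_frames(frame_embeddings):
    """Average frame-level embeddings of shape (T, D) into one vector of shape (D,)."""
    frame_embeddings = np.asarray(frame_embeddings, dtype=float)
    return frame_embeddings.mean(axis=0)


def extract_openl3(path, embedding_size=512):
    """Return one clip-level OpenL3 embedding for the audio file at path."""
    # Imported lazily so pool_frames can be used without these packages.
    import openl3
    import soundfile as sf

    audio, sr = sf.read(path)
    # get_audio_embedding returns frame-level embeddings and their timestamps.
    emb, _timestamps = openl3.get_audio_embedding(
        audio, sr, content_type="music", embedding_size=embedding_size
    )
    return pool_frames(emb)
```

For example, `extract_openl3("example.wav")` would return a 512-dimensional vector for a hypothetical file `example.wav`.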

Bias correction

Run the notebooks in the notebooks directory:

0_data_distribution.ipynb: investigate each dataset's genre distribution and number of samples per class
1_debias_linear.ipynb: linear bias correction (original, LDA, mLDA)
2_debias_nonlinear.ipynb: nonlinear bias correction (K, KLDA, mKLDA)
3_cosine_similarity.ipynb: compute the cosine similarity between dataset separation and instrument classification; check the matrix rank in the case of multiple bias correction
4_result_summary.ipynb: summarize results from all bias correction methods
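To give a feel for the linear case, the sketch below shows one way an LDA-style correction could work: estimate the direction that best separates two datasets in embedding space, then project that direction out of every embedding. This is an illustrative assumption on synthetic data, not the code in deem.py; all function names are hypothetical, and the "classifier direction" is a random stand-in for real instrument classifier weights.

```python
import numpy as np


def lda_direction(X, domain):
    """Unit-norm two-class Fisher LDA direction separating domain 0 from domain 1."""
    X0, X1 = X[domain == 0], X[domain == 1]
    mean_diff = X1.mean(axis=0) - X0.mean(axis=0)
    # Pooled within-class scatter, with a small ridge for numerical stability.
    Sw = np.cov(X0, rowvar=False) * (len(X0) - 1)
    Sw += np.cov(X1, rowvar=False) * (len(X1) - 1)
    Sw += 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mean_diff)
    return w / np.linalg.norm(w)


def project_out(X, w):
    """Remove the component of each row of X along the unit vector w."""
    return X - np.outer(X @ w, w)


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic embeddings: dataset B is shifted along the first axis (the "bias").
rng = np.random.default_rng(0)
D = 8
shift = np.zeros(D)
shift[0] = 2.0
X = np.vstack([rng.normal(size=(200, D)), rng.normal(size=(200, D)) + shift])
domain = np.array([0] * 200 + [1] * 200)

w = lda_direction(X, domain)
X_debiased = project_out(X, w)

# Hypothetical stand-in for the weight vector of an instrument classifier.
clf_direction = rng.normal(size=D)
similarity = cosine_similarity(w, clf_direction)
```

After projection, the embeddings carry no component along `w`, so a linear classifier can no longer use that direction to tell the datasets apart. A cosine similarity near zero would suggest that removing `w` costs the classifier little, while a value near one in magnitude would suggest the two directions are entangled.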

Note

Thanks to Jayeon Yi, we noticed two typos in the paper concerning the dimensionality of $W$ and $U$ in Equation (3). The corrections are as follows:

$W\in\mathbb{R}^{D\times G}$ -> $W\in\mathbb{R}^{G\times D}$
$U\in\mathbb{R}^{D\times G}$ -> $U\in\mathbb{R}^{G\times G}$

Contact

For any questions, support, or inquiries, please feel free to contact [email protected].

Cite

Please cite the following paper if you use the code provided in this repository.

Changhong Wang, Gaël Richard, and Brian McFee. "Transfer Learning and Bias Correction with Pre-trained Audio Embeddings". Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference, 2023.

@inproceedings{wang2023transfer,
    author = {Changhong Wang and Gaël Richard and Brian McFee},
    title = {Transfer Learning and Bias Correction with Pre-trained Audio Embeddings},
    booktitle = {Proceedings of the International Society for Music Information Retrieval (ISMIR) Conference},
    year = 2023,
}
