This repository contains evaluation scripts and pretrained models for the VoxMovies dataset.

Install the dependencies with:

```bash
pip install -r requirements.txt
```
- The VoxMovies dataset and test pairs can be downloaded from here.
- For the e2, e3, and e4 test pairs, you also need both the VoxCeleb1 dev and test sets. Please download them from here.
- Place VoxCeleb1 and VoxMovies under `args.test_path`. We used symlinks for this, as shown below.
- Note that the `PATH_TO_VOXCELEB1` directory needs to contain wav files from both the VoxCeleb1 dev and test sets.
```bash
mkdir data
cd data
ln -s PATH_TO_VOXCELEB1 voxceleb1
ln -s PATH_TO_VOXMOVIES_TEST_SET voxmovies_test
```
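After creating the symlinks, `data/` should look roughly like this (the exact subfolder structure comes from the downloaded archives):

```
data/
├── voxceleb1 -> PATH_TO_VOXCELEB1              # VoxCeleb1 dev + test wav files merged
└── voxmovies_test -> PATH_TO_VOXMOVIES_TEST_SET
```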
Then, run the evaluation script:

```bash
python eval.py --initial_model PATH_TO_PRETRAINED_MODEL --test_list PATH_TO_TEST_PAIRS --test_path data/
```
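The test pair files follow the voxceleb trainer list format: each line is a label (1 for same-speaker, 0 for different-speaker), followed by two wav paths relative to `--test_path`. The paths below are purely illustrative; the actual file names come from the downloaded pair lists.

```
1 voxceleb1/id10309/xxxxxxx/00001.wav voxmovies_test/id10309/clip_001.wav
0 voxceleb1/id10296/yyyyyyy/00003.wav voxmovies_test/id10305/clip_014.wav
```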
- Both the pretrained baseline model and the finetuned model are provided below.
- Note that `baseline_v2_ap.model` is already publicly available. Please refer to here for more details about the model architecture and training procedure.
Evaluation results (EER %) on the VoxMovies test sets:

| Test pairs | e1 | e2 | e3 | e4 | e5 |
|---|---|---|---|---|---|
| baseline_v2_ap.model | 6.09 | 7.40 | 7.50 | 9.23 | 10.47 |
| finetuned.model | 5.76 | 7.10 | 8.36 | 7.37 | 9.55 |
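EER (equal error rate) is the operating point where the false-accept and false-reject rates are equal. A minimal sketch of how it can be computed from trial scores, assuming scikit-learn is available (this is illustrative, not the metric code used in `eval.py`):

```python
# Hypothetical EER computation sketch; eval.py has its own metric code.
import numpy as np
from sklearn.metrics import roc_curve

def compute_eer(scores, labels):
    """scores: similarity per trial (higher = same speaker);
    labels: 1 for target trials, 0 for impostor trials."""
    fpr, tpr, _ = roc_curve(labels, scores, pos_label=1)
    fnr = 1 - tpr
    # EER is where the false-positive and false-negative rates cross.
    idx = np.nanargmin(np.abs(fnr - fpr))
    return (fpr[idx] + fnr[idx]) / 2

# Example with dummy scores/labels:
# eer = compute_eer([0.9, 0.2, 0.7, 0.1], [1, 0, 1, 0])
# print(f"EER: {eer * 100:.2f}%")
```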
This evaluation code is largely based on the clova voxceleb trainer. Please refer to that repo if you want to train or finetune the model.
If you make use of this code, please cite:
```bibtex
@InProceedings{Brown21b,
  title={Playing a Part: Speaker Verification at the Movies},
  author={Andrew Brown and Jaesung Huh and Arsha Nagrani and Joon Son Chung and Andrew Zisserman},
  year={2021},
  booktitle={International Conference on Acoustics, Speech, and Signal Processing (ICASSP)}
}
```