A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods
PyTorch code for the following paper:
- Title: HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods
- Accepted at ICASSP 2024: https://arxiv.org/pdf/2309.08208.pdf
- Authors: Hyun-seo Shin*, Jungwoo Heo*, Ju-ho Kim, Chan-yeong Lim, Wonbin Kim, Ha-Jin Yu
Audio deepfake detection (ADD) is the task of detecting spoofing attacks generated by text-to-speech or voice conversion systems. Spoofing evidence, which helps to distinguish between spoofed and bona-fide utterances, may exist either locally or globally in the input features. The Conformer, which combines Transformers and CNNs, has a suitable structure for capturing both. However, since the Conformer was designed for sequence-to-sequence tasks, its direct application to ADD tasks may be sub-optimal. To tackle this limitation, we propose HM-Conformer, which adopts two components: (1) a hierarchical pooling method that progressively reduces the sequence length to eliminate duplicated information, and (2) a multi-level classification token aggregation method that utilizes classification tokens to gather information from different blocks. Owing to these components, HM-Conformer can efficiently detect spoofing evidence by processing various sequence lengths and aggregating them. On the ASVspoof 2021 Deepfake dataset, HM-Conformer achieved a 15.71% EER, showing competitive performance compared to recent systems.
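As a rough illustration of these two components, here is a minimal PyTorch sketch. It is hypothetical, not the modules from this repository: plain Transformer layers stand in for the Conformer blocks, and the dimensions, block count, and pooling factor are illustrative only.
import torch
import torch.nn as nn

class HierarchicalCLSSketch(nn.Module):
    """Toy illustration of the two ideas above: pooling between encoder
    blocks and a classification (CLS) token read out at every level."""
    def __init__(self, dim=144, n_blocks=3):
        super().__init__()
        # plain Transformer layers stand in for Conformer blocks here
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
            for _ in range(n_blocks))
        self.pool = nn.MaxPool1d(2, 2)   # halve the frame sequence per level
        self.cls = nn.Parameter(torch.randn(1, 1, dim))
        self.head = nn.Linear(dim * n_blocks, 2)  # bona-fide vs. spoof

    def forward(self, x):                # x: (batch, frames, dim)
        cls = self.cls.expand(x.size(0), -1, -1)
        cls_per_level = []
        for block in self.blocks:
            out = block(torch.cat([cls, x], dim=1))
            cls, x = out[:, :1], out[:, 1:]        # split CLS from frames
            cls_per_level.append(cls.squeeze(1))   # keep this level's CLS
            x = self.pool(x.transpose(1, 2)).transpose(1, 2)  # shorten sequence
        # aggregate the CLS tokens from all levels for the final decision
        return self.head(torch.cat(cls_per_level, dim=-1))

print(HierarchicalCLSSketch()(torch.randn(2, 400, 144)).shape)  # (2, 2)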
We used the ASVspoof 2019 LA and ASVspoof 2021 DF datasets to train and evaluate our proposed method.
Once you have the datasets ready, run the scripts below to augment the data.
- Write your paths in `data_prepare.py` and `make_metadata.py`
# data_prepare.py line 117
# you need to write your path to ASVspoof 2019
YOUR_ASVspoof2019_PATH = {YOUR_ASVspoof2019_PATH} # '/ASVspoof2019'
path_train = YOUR_ASVspoof2019_PATH + '/LA/ASVspoof2019_LA_train'
# make_metadata.py line 51
YOUR_ASVspoof2019_PATH = {YOUR_ASVspoof2019_PATH} # '/ASVspoof2019'
path_meta = YOUR_ASVspoof2019_PATH + '/LA/ASVspoof2019_LA_cm_protocols/ASVspoof2019.LA.cm.dev.trl.txt'
- Run `data_prepare.py` and `make_metadata.py` in a terminal
# One by one
python data_prepare.py
python make_metadata.py
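The augmentation itself happens inside data_prepare.py. Purely as a hypothetical illustration (not the repository's actual pipeline), codec-style augmentation with ffmpeg, which is listed in the requirements below, could look like this sketch:
import subprocess

# hypothetical sketch: re-encode a wav through MP3 to add compression
# artifacts, similar in spirit to the codec conditions of ASVspoof 2021 DF
def codec_augment(src_wav, dst_wav, bitrate='64k', tmp='/tmp/codec_tmp.mp3'):
    subprocess.run(['ffmpeg', '-y', '-i', src_wav, '-b:a', bitrate, tmp], check=True)
    subprocess.run(['ffmpeg', '-y', '-i', tmp, dst_wav], check=True)

codec_augment('input.wav', 'input_mp3.wav')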
- We used the 'nvcr.io/nvidia/pytorch:22.08-py3' image from Nvidia GPU Cloud for conducting our experiments.
- Python 3.8.12
- PyTorch 1.13.0+cu117
- Torchaudio 0.13.0+cu117
- ffmpeg
- Build the Docker image
# build docker image
# run from the repository root, e.g. ~/HM-Conformer
./docker/build.sh
- Run the Docker image
sudo docker run --gpus all -it --rm --ipc=host \
    -v {PATH_DB}:/data \
    -v {PATH_HM-Conformer}/env202305:/environment \
    -v {PATH_HM-Conformer}/env202305/results:/results \
    -v {PATH_HM-Conformer}/exp_lib:/exp_lib \
    -v {PATH_HM-Conformer}:/code \
    env202305:latest
# CAUTION! You need to write your path
# PATH_DB
# |- ASVspoof2019
# | |- LA
# |- ASVspoof2021_DF
# |- ASVspoof2021_DF_eval
# |- keys
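A quick way to confirm that the mounts inside the container match this layout (a hypothetical helper script, not part of the repository):
import os

# check that the /data mount matches the directory tree shown above
for path in ['/data/ASVspoof2019/LA',
             '/data/ASVspoof2021_DF/ASVspoof2021_DF_eval',
             '/data/ASVspoof2021_DF/keys']:
    print('ok' if os.path.isdir(path) else 'MISSING', path)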
First, you need to set the system arguments. You can set them in `arguments.py`. Here is the list of system arguments to set.
1. 'usable_gpu' : {Available_GPUs}
'usable_gpu' lists the indices of the GPUs you want to use.
Input type is str # ex) '0,1'
CAUTION! You need to use 2 or more GPUs.
2. 'TEST' : True or False
'TEST' determines whether to run inference or training.
Set it to True if you only want to run inference, or False if you want to train.
Input type is bool
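For example, a minimal configuration (values are illustrative) might look like this in the "system_args" dictionary of arguments.py:
# illustrative values for the two system arguments described above
system_args = {
    'usable_gpu': '0,1',   # str: comma-separated GPU indices, 2 or more
    'TEST': False,         # bool: False = train, True = inference only
}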
We have a basic logger that stores information locally. However, if you would like to use an additional online logger (wandb or neptune):
- In `arguments.py`
# Wandb: add 'wandb_group', 'wandb_entity', and 'wandb_api_key'
# Neptune: add 'neptune_user' and 'neptune_token'
# put these arguments in the "system_args" dictionary:
# for example
'wandb_group' : 'group',
'wandb_entity' : 'user-name',
'wandb_api_key' : 'WANDB_TOKEN',
'neptune_user' : 'user-name',
'neptune_token' : 'NEPTUNE_TOKEN'
- In `main.py`
# just uncomment the online logger(s) you want to use
# logger
builder = egg_exp.log.LoggerList.Builder(args['name'], args['project'], args['tags'],
args['description'], args['path_scripts'], args)
builder.use_local_logger(args['path_log'])
# builder.use_neptune_logger(args['neptune_user'], args['neptune_token'])
# builder.use_wandb_logger(args['wandb_entity'], args['wandb_api_key'],
# args['wandb_group'])
logger = builder.build()
logger.log_arguments(experiment_args)
Run `main.py` in the Docker container.
python /code/hm_conformer/main.py
Please cite this paper if you make use of the code.
# add later...