This repository contains the official implementation of the paper:
"Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection" (NAACL Findings 2025).
Paper: https://arxiv.org/abs/2502.03559
This work conducts a systematic layer-wise analysis of self-supervised learning (SSL) models (e.g., Wav2Vec2, HuBERT, WavLM) for audio deepfake detection. The analysis covers multilingual datasets (English, Chinese, Spanish), partial, song, and scene-based deepfakes, and varying acoustic conditions. Key findings include:
- Lower layers (1-6 for small models, 1-12 for large models) provide the most discriminative features for detection.
- Reduced-layer models achieve competitive performance while lowering computational costs.
- Results generalize across languages (En, Zh, Es) and deepfake scenarios (full/partial/song/scene).
- Clone the repository:
git clone https://github.com/Yaselley/SSL_Layerwise_Deepfake
cd SSL_Layerwise_Deepfake
To install the required dependencies, run the following command:
pip install -r requirements.txt
Publicly available datasets used in this work:
- ASVspoof 2019 (LA19)
- ASVspoof 2021 (LA21/DF21)
- ADD23
- HABLA
- PartialSpoof
- SceneFake
- CtrSVDD
Each dataset has a dedicated script. Example for ASVspoof 2019 (LA19):
# --model: w2v, hubert, or wavlm
# --small: use the small SSL model variant
# --n_layers: number of transformer layers (e.g., 6 for small models)
# --back: backend classifier, FFN or AASIST
python main.py \
  --seed 42 \
  --model w2v \
  --small \
  --n_layers 6 \
  --back FFN
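To compare layer counts, the command above can be repeated with different `--n_layers` values. The sketch below only assembles argument lists from the flags shown in the example. The helper name `build_command`, the swept layer values, and the assumption that `--small` is a boolean switch that is simply omitted for large models are illustrative and not taken from the repository.

```python
import shlex


def build_command(model, n_layers, back, seed=42, small=True, script="main.py"):
    """Build an argv list using only the flags shown in the example command."""
    cmd = ["python", script, "--seed", str(seed), "--model", model]
    if small:
        # Assumption: --small is a boolean switch; leaving it out selects the large variant.
        cmd.append("--small")
    cmd += ["--n_layers", str(n_layers), "--back", back]
    return cmd


if __name__ == "__main__":
    # Hypothetical sweep: these layer counts are illustrative choices.
    for n in (2, 4, 6):
        print(shlex.join(build_command("w2v", n, "FFN")))
```

Each printed line can be pasted into a shell, or the list returned by `build_command` can be passed to `subprocess.run` to launch the run directly.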
- Scene-based deepfakes: main_sceneFake.py
- Spanish (HABLA): main_spanish.py
- Partial deepfakes: main_partial.py
- Heatmaps (layer-wise weights) and tables are generated in the analysis/ folder.
- Lower SSL layers are critical for detecting artifacts in synthetic audio.
- Reducing layers (e.g., 4-6 for small models) achieves competitive performance while improving inference speed by 2-3×.
- Models generalize across languages and deepfake types, offering practical solutions for real-world deployment.
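As a rough illustration of the points above, the sketch below shows one way to read layer-wise weights and to estimate the encoder savings from dropping layers. It assumes the weights are scalar per-layer scores normalized with a softmax (a common choice for weighted layer pooling, not confirmed as this repository's method), that a small model has 12 transformer layers, and that encoder cost grows linearly with layer count. The example scores are invented for demonstration and are not results from the paper.

```python
import math


def softmax(scores):
    """Normalize raw per-layer scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def lower_layer_mass(weights, k):
    """Fraction of the total weight assigned to the first k layers."""
    return sum(weights[:k])


def ideal_encoder_speedup(total_layers, kept_layers):
    """Idealized transformer-encoder speedup if cost is linear in layer count.

    Ignores the convolutional feature extractor and the backend classifier,
    so measured speedups can differ.
    """
    return total_layers / kept_layers


if __name__ == "__main__":
    # Illustrative scores only (NOT results from the paper): higher for lower layers.
    scores = [2.0, 2.0, 1.5, 1.5, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    weights = softmax(scores)
    print(f"weight on layers 1-6: {lower_layer_mass(weights, 6):.2f}")
    # Assumption: a 12-layer small model truncated to 6 or 4 layers.
    print(ideal_encoder_speedup(12, 6), ideal_encoder_speedup(12, 4))
```

The ratio is an idealized estimate for the transformer encoder alone. The 2-3× figure reported above is an empirical result from the paper, not something derived from this calculation.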
@misc{kheir2025comprehensivelayerwiseanalysisssl,
title={Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection},
author={Yassine El Kheir and Youness Samih and Suraj Maharjan and Tim Polzehl and Sebastian Möller},
year={2025},
eprint={2502.03559},
archivePrefix={arXiv},
primaryClass={eess.AS},
url={https://arxiv.org/abs/2502.03559},
}