AIR-ASVspoof

One-Class Learning Towards Synthetic Voice Spoofing Detection

This repository contains the official implementation of our IEEE Signal Processing Letters (SPL) paper, "One-Class Learning Towards Synthetic Voice Spoofing Detection."

[poster] [slides] [video] [Project webpage]

Updates

[Jun. 2023] We further improved the loss function by proposing the SAMO (Speaker Attractor Multi-Center One-Class Learning) algorithm, presented at ICASSP 2023 (Ding et al. 2023).

[Feb. 2023] We investigated one-class learning further and added new loss functions. See the book chapter published in the Handbook of Biometric Anti-Spoofing (Zhang et al. 2023).

[Sep. 2021] This version of the code uses LFCC+ResNet as the backbone. The LFCC feature extraction is implemented in MATLAB, and the ResNet is implemented in PyTorch. For a full Python implementation, see our follow-up work at Interspeech 2021 (Zhang et al. 2021).

Requirements

python==3.6

pytorch==1.1.0

Data Preparation

The LFCC features are extracted with the MATLAB implementation provided by the ASVspoof 2019 organizers. First run process_LA_data.m in MATLAB, then run python3 reload_data.py. Make sure to change the directory paths to match the locations on your machine.
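Because both steps depend on machine-specific directories, it can help to confirm that those directories exist before starting a long extraction run. The snippet below is a small sketch of such a check; the helper name find_missing_dirs and the example directory names are hypothetical and are not part of this repository.

```python
from pathlib import Path


def find_missing_dirs(named_paths):
    """Return the names whose paths are not existing directories.

    named_paths maps a human-readable name to a directory path.
    Both the names and the paths are supplied by the caller; nothing
    here is read from the repository's scripts.
    """
    return [name for name, path in named_paths.items()
            if not Path(path).is_dir()]
```

For example, find_missing_dirs({"database": "/data/LA", "features": "/data/features"}) returns the names of any entries that still need to be created or corrected (both paths here are placeholders).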

Run the training code

Before running train.py, change path_to_database, path_to_features, and path_to_protocol to match the file locations on your machine.

python3 train.py --add_loss ocsoftmax -o ./models/ocsoftmax --gpu 0
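The ocsoftmax option refers to the one-class loss proposed in the paper, which compacts bona fide embeddings around a single direction and uses an angular margin to push spoofed embeddings away from it. The following is a minimal pure-Python sketch of that idea, not the repository's implementation. The function name oc_softmax_loss, the label convention (0 = bona fide, 1 = spoof), and the margin and scale values are assumptions chosen for illustration.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def _softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def oc_softmax_loss(embeddings, labels, center,
                    m_real=0.9, m_fake=0.2, alpha=20.0):
    """Average one-class softmax-style loss over a batch.

    embeddings: list of feature vectors (lists of floats)
    labels:     0 for bona fide, 1 for spoof (assumed convention)
    center:     direction that bona fide embeddings should align with
    m_real, m_fake, alpha: illustrative margins and scale (assumptions)
    """
    w_hat = _normalize(center)
    total = 0.0
    for x, y in zip(embeddings, labels):
        x_hat = _normalize(x)
        cos = sum(a * b for a, b in zip(w_hat, x_hat))
        if y == 0:
            # Bona fide: penalize cosine similarity below m_real.
            z = alpha * (m_real - cos)
        else:
            # Spoof: penalize cosine similarity above m_fake.
            z = alpha * (cos - m_fake)
        total += _softplus(z)
    return total / len(labels)
```

With these definitions, a batch in which bona fide embeddings point along the center and spoofed embeddings point away from it gives a loss near zero, while the reverse arrangement gives a large loss.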

Run the test code with trained model

Change model_dir to the location of the model you would like to test.

python3 test.py -m ./models/ocsoftmax -l ocsoftmax --gpu 0
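The paper reports its results as an equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate are equal. As a rough illustration of the metric, and not the repository's own evaluation code, the sketch below computes an approximate EER from two lists of scores. The function name compute_eer and the convention that higher scores mean "more likely bona fide" are assumptions.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Approximate the equal error rate by scanning score thresholds.

    A trial is accepted when its score is >= the threshold.
    Assumes higher scores indicate bona fide speech.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds.append(float("inf"))  # threshold that rejects everything

    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False rejection: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoof trials scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2.0
    return eer
```

On a finite score set the two error rates may never be exactly equal, so this sketch returns the midpoint at the threshold where they are closest. Evaluation scripts often interpolate instead.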

Citation

@ARTICLE{zhang2021one,
  author={Zhang, You and Jiang, Fei and Duan, Zhiyao},
  journal={IEEE Signal Processing Letters},
  title={One-Class Learning Towards Synthetic Voice Spoofing Detection},
  year={2021},
  volume={28},
  pages={937-941},
  abstract={Human voices can be used to authenticate the identity of the speaker, but the automatic speaker verification (ASV) systems are vulnerable to voice spoofing attacks, such as impersonation, replay, text-to-speech, and voice conversion. Recently, researchers developed anti-spoofing techniques to improve the reliability of ASV systems against spoofing attacks. However, most methods encounter difficulties in detecting unknown attacks in practical use, which often have different statistical distributions from known attacks. Especially, the fast development of synthetic voice spoofing algorithms is generating increasingly powerful attacks, putting the ASV systems at risk of unseen attacks. In this work, we propose an anti-spoofing system to detect unknown synthetic voice spoofing attacks (i.e., text-to-speech or voice conversion) using one-class learning. The key idea is to compact the bona fide speech representation and inject an angular margin to separate the spoofing attacks in the embedding space. Without resorting to any data augmentation methods, our proposed system achieves an equal error rate (EER) of 2.19% on the evaluation set of ASVspoof 2019 Challenge logical access scenario, outperforming all existing single systems (i.e., those without model ensemble).},
  doi={10.1109/LSP.2021.3076358},
  ISSN={1558-2361}
}

Follow-up works

Please check out our follow-up works:

[1] Zhang, Y., Zhu, G., Jiang, F., Duan, Z. (2021) An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems. Proc. Interspeech 2021, 4309-4313, doi: 10.21437/Interspeech.2021-1820 [link] [arXiv] [code] [video]

[2] Chen, X., Zhang, Y., Zhu, G., Duan, Z. (2021) UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021. Proc. 2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge, 75-82, doi: 10.21437/ASVSPOOF.2021-12 [link] [arXiv] [code] [video]

[3] Zhang, Y., Jiang, F., Zhu, G., Chen, X., Duan, Z. (2023) Generalizing Voice Presentation Attack Detection to Unseen Synthetic Attacks and Channel Variation. In Handbook of Biometric Anti-Spoofing: Presentation Attack Detection and Vulnerability Assessment, 421-443 [link] [code]

[4] Ding, S., Zhang, Y., Duan, Z. (2023) SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing. Proc. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) [link] [arXiv] [code] [video]

