Zalo AI Challenge - Voice Verification

This repository contains a framework for training the speaker verification model described in [2],
with the score normalization post-processing described in [3].
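To make the post-processing step concrete, here is a minimal sketch of adaptive symmetric score normalization (AS-norm), one of the variants analysed in [3]. It is an illustration under assumptions, not the code in inference.py: the function names, the use of cosine similarity, and the `top_n` value are all hypothetical, and the repository may use a different variant.

```python
import numpy as np


def cosine_scores(x, cohort):
    """Cosine similarity between one embedding and every cohort embedding."""
    x = x / np.linalg.norm(x)
    cohort = cohort / np.linalg.norm(cohort, axis=1, keepdims=True)
    return cohort @ x


def adaptive_s_norm(enroll, test, cohort, top_n=100, eps=1e-8):
    """Normalize a trial score with statistics of the closest cohort scores.

    enroll, test: 1-D embeddings (hypothetical inputs).
    cohort: 2-D array, one cohort embedding per row.
    top_n: number of highest cohort scores used on each side (assumed value).
    """
    raw = float(
        np.dot(enroll, test) / (np.linalg.norm(enroll) * np.linalg.norm(test))
    )
    # Keep only the top_n cohort scores for each side of the trial.
    e_top = np.sort(cosine_scores(enroll, cohort))[-top_n:]
    t_top = np.sort(cosine_scores(test, cohort))[-top_n:]
    z_enroll = (raw - e_top.mean()) / (e_top.std() + eps)
    z_test = (raw - t_top.mean()) / (t_top.std() + eps)
    # Symmetric normalization: average the two z-scores.
    return 0.5 * (z_enroll + z_test)
```

Because normalized scores are no longer bounded by the cosine range, a decision threshold chosen on them can exceed 1.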

Dependencies

pip install -r requirements.txt

Data Preparation

Download the public dataset, then put the training speakers' data in dataset/wavs and the public-test folder in dataset/public-test.
Convert the data (this overwrites the original files)

python dataprep.py --save_path dataset/wavs --convert
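As a rough picture of what an in-place conversion with ffmpeg can look like, the sketch below writes the converted audio to a temporary file and then replaces the original. It is not taken from dataprep.py: the helper names are hypothetical, and the target format (16 kHz, mono) is an assumption.

```python
import os
import subprocess
import tempfile


def build_ffmpeg_cmd(src, dst, sample_rate=16000):
    """Build an ffmpeg command that resamples to mono at sample_rate (assumed target)."""
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", str(sample_rate), dst]


def convert_in_place(path, sample_rate=16000):
    """Convert one file and overwrite it; ffmpeg cannot read and write the same path."""
    fd, tmp = tempfile.mkstemp(suffix=".wav", dir=os.path.dirname(path) or ".")
    os.close(fd)
    try:
        subprocess.run(
            build_ffmpeg_cmd(path, tmp, sample_rate),
            check=True,
            capture_output=True,
        )
        os.replace(tmp, path)  # swap in the converted file
    finally:
        # Clean up the temporary file if the conversion failed.
        if os.path.exists(tmp):
            os.remove(tmp)
```

Since the original file is replaced, keeping a backup of the downloaded dataset before converting is a sensible precaution.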

Prepare the augmentation data

python dataprep.py --save_path dataset --augment

Generate the train and validation lists

python dataprep.py --save_path dataset/wavs --generate --split_ratio -1

In addition to the Python dependencies, wget and ffmpeg must be installed on the system.

Pretrained models

Pretrained models and corresponding cohorts can be downloaded from here.

Training

Train from the pretrained model with augmentation for 500 epochs.

python train.py --augment --max_epoch 500 --batch_size 320 --initial_model checkpoints/baseline_v2_ap.model

Inference

Prepare cohorts

python inference.py --prepare --save_path checkpoints/cohorts_final_500_f100.npy --initial_model checkpoints/final_500.model

Evaluate and tune thresholds

python inference.py --eval --cohorts_path checkpoints/cohorts_final_500_f100.npy --initial_model checkpoints/final_500.model
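One common way to tune a verification threshold is to pick the score at which the false acceptance rate and the false rejection rate are closest (the equal error rate point). The sketch below illustrates that idea with NumPy only. Whether inference.py selects its threshold this way is an assumption, and the function name is hypothetical.

```python
import numpy as np


def eer_threshold(scores, labels):
    """Return (threshold, eer) where FAR and FRR are closest.

    scores: trial scores, higher means "same speaker".
    labels: 1 for target trials, 0 for non-target trials (both must be present).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best = None
    for thr in np.unique(scores):
        accept = scores >= thr
        far = np.mean(accept[labels == 0])   # non-targets wrongly accepted
        frr = np.mean(~accept[labels == 1])  # targets wrongly rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, float(thr), float((far + frr) / 2))
    return best[1], best[2]
```

A threshold found this way on the validation trials could then be passed to the test run through --test_threshold.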

Run on test set

python inference.py --test --cohorts_path checkpoints/cohorts_final_500_f100.npy --test_threshold 1.7206447124481201 --test_path dataset --initial_model checkpoints/final_500.model

Citation

[1] In defence of metric learning for speaker recognition

@inproceedings{chung2020in,
    title={In defence of metric learning for speaker recognition},
    author={Chung, Joon Son and Huh, Jaesung and Mun, Seongkyu and Lee, Minjae and Heo, Hee Soo and Choe, Soyeon and Ham, Chiheon and Jung, Sunghwan and Lee, Bong-Jin and Han, Icksang},
    booktitle={Interspeech},
    year={2020}
}

[2] Clova baseline system for the VoxCeleb Speaker Recognition Challenge 2020

@article{heo2020clova,
    title={Clova baseline system for the {VoxCeleb} Speaker Recognition Challenge 2020},
    author={Heo, Hee Soo and Lee, Bong-Jin and Huh, Jaesung and Chung, Joon Son},
    journal={arXiv preprint arXiv:2009.14153},
    year={2020}
}

[3] Analysis of score normalization in multilingual speaker recognition

@inproceedings{matejka2017analysis,
    title = {Analysis of Score Normalization in Multilingual Speaker Recognition},
    author = {Matejka, Pavel and Novotny, Ondrej and Plchot, Oldřich and Burget, Lukas and Diez, Mireia and Černocký, Jan},
    booktitle = {Interspeech},
    year = {2017}
}

