yw0nam/avobject_torch

About this repository

Unofficial PyTorch implementation of Self-Supervised Learning of Audio-Visual Objects from Video [project page].

Installation

Requirements

  • Linux
  • Python 3.6+
  • PyTorch 1.8.1 or higher and CUDA

a. Create a conda virtual environment and activate it.

conda create -n avobject_torch python=3.6
conda activate avobject_torch

b. Install PyTorch and torchvision following the official instructions.
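
For example, a CUDA 11.1 build of PyTorch 1.8.1 can be installed with pip. This is one possible setup, not the only supported one; match the CUDA suffix to your driver:

```
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html
```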

c. Clone this repository.

git clone https://github.com/yw0nam/avobject_torch/
cd avobject_torch

d. Install requirements.

pip install -r requirements.txt

Dataset

Trained on LRS2 or LRS3.

| data | training samples | validation samples |
|------|------------------|--------------------|
| LRS2 | 72052            | 158                |
| LRS3 | 88520            | 408                |
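
The directory layout expected by makefile_ls.py (used below) is not documented here. As a rough guide, LRS2 and LRS3 ship as per-speaker folders of short .mp4 clips, so dataset_root is assumed to look something like this (an assumption, not a verified requirement of this code):

```
dataset_root/
├── speaker_id_A/
│   ├── 00001.mp4
│   └── 00002.mp4
└── speaker_id_B/
    └── 00001.mp4
```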

Training

a. Run makefile_ls.py to generate dev.txt and test.txt.

python makefile_ls.py --root_dir dataset_root

b. Run the training code (parameters can be changed; check the argparser in train.py).

python train.py 
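
The flag names below are hypothetical, shown only to illustrate overriding defaults; run python train.py --help to see the options actually defined in the argparser:

```
python train.py --batch_size 16 --max_epoch 10
```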

Prediction

Not implemented yet; it will be released soon.
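
Until the prediction script is released, inference will presumably follow the standard PyTorch pattern: load a trained checkpoint, switch to eval mode, and run a forward pass. A minimal sketch of that pattern, where every name, path, and tensor shape is a placeholder rather than this repository's actual API:

```python
import torch

# Placeholder for the real audio-visual model; swap in the released class later.
model = torch.nn.Identity()
# state = torch.load('checkpoints/lrs2.pt', map_location='cpu')  # placeholder path
# model.load_state_dict(state)
model.eval()

with torch.no_grad():
    clip = torch.randn(1, 3, 25, 224, 224)  # placeholder (B, C, T, H, W) video clip
    out = model(clip)  # forward pass; output shape depends on the real model

print(out.shape)
```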

Result

Note that this repository is an ongoing project.

I'm still training the model and implementing downstream tasks (e.g., active speaker detection and sound source separation).

| data | train loss | validation loss | epoch |
|------|------------|-----------------|-------|
| LRS2 | 0.234909   | 0.065351        | 6     |
| LRS3 | 0.311373   | 0.208642        | 3     |

Here is a prediction result from the model trained on LRS2.

Thanks to

The repository is based on syncnet_trainer and avobject.
