
Ava-256: Universal Encoders and Decoders

Together with Goliath, part of Codec Avatar Studio

We provide

  • 256 paired high-resolution dome and headset captures
  • Code to build universal face encoders and decoders

(Image: 256_subjects)

Compiling extensions

We need to compile the CUDA raymarcher and some utilities. This can be done with

cd extensions/mvpraymarch
make

and

cd extensions/utils
make
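
Since both extensions are CUDA-based, it can save time to confirm that a CUDA device is visible before building. Below is a minimal, hypothetical sanity check (not part of the repo); it assumes PyTorch is installed, which the training scripts require.

# check_cuda.py -- hypothetical helper, not part of this repository
import torch

# The extensions target the GPU, so a visible CUDA device is needed
# to actually use them after compilation.
if torch.cuda.is_available():
    print("CUDA OK:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device visible; the compiled extensions will not run.")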

Data

We provide a handy multithreaded script to download the dataset from AWS:

python download.py -o <output-dir> -n 1 -j 8

This will download a single capture to <output-dir> using 8 threads. You can increase -n to download more captures (up to 256) and -j to use more threads. Note that, by default, this downloads the 4TB version of the dataset. If you want longer or higher-quality captures (at the cost of more storage), pass --size {8,16,32}TB.
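
For example, the following fetches four captures with 16 threads from the 8TB variant, using only the flags documented above (the output directory is arbitrary):

python download.py -o ./ava256 -n 4 -j 16 --size 8TB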

Decoder

For every subject, the decoder data includes 80 camera views with camera calibration, head pose in world coordinates, a registered mesh, 3D keypoints, and semantic segmentations.

(Video: decoder_assets.mp4)

Encoder

For every subject, the encoder data consists of 5 infrared camera views captured by a Quest Pro headset.

(Video: quest_pro.mp4)

For more details on the data format, see Question 4 under the Composition Section of our datasheet.

Train

To train a simple model on a standalone machine:

  1. Update the config file under configs/config.yaml, especially dataset_directory, so that it points to the dataset
  2. Run python ddp-train.py

Note that you can override any parameter in the config file by passing flags on the CLI, e.g.

python ddp-train.py --train.dataset_dir=<mydir>

will override the train.dataset_dir parameter in the config file.
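
For reference, here is a hypothetical excerpt of what the relevant part of configs/config.yaml might look like (check the actual file for the exact key names, which may differ):

# configs/config.yaml (hypothetical excerpt)
train:
  dataset_dir: /path/to/ava-256   # point this at the downloaded captures

Overrides compose, so several keys can be set in one invocation by passing each as a dotted flag in the same form as --train.dataset_dir above.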

To train on Avatar RSC, you can use

 ava sync ava-256; SCENV=ava rsc_launcher launch \
  --projects AIRSTORE_AVATAR_RSC_DATA_PIPELINE_CRYPTO \
  -e 'cd ~/rsc/ava-256 && sbatch sbatch.sh'

Visualization

To visualize a trained model, run

python render.py

This should create a visualization showcasing the consistent expression space of the decoder:

(Video: EXP_eyes_blink_light_medium_hard_wink.mp4)

Tests

You can run the test suite with

python -m pytest tests/
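
Standard pytest selection flags work as usual; for instance, -k filters tests by name and -x stops at the first failure (the "raymarch" keyword below is illustrative, not a guaranteed test name):

python -m pytest tests/ -k raymarch
python -m pytest tests/ -x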