Official Open Source code for "Masked Autoencoders As Spatiotemporal Learners"

facebookresearch/mae_st

Masked Autoencoders As Spatiotemporal Learners: A PyTorch Implementation

This is a PyTorch/GPU re-implementation of the paper Masked Autoencoders As Spatiotemporal Learners:

@Article{MaskedAutoencodersSpatiotemporal2022,
  author  = {Christoph Feichtenhofer and Haoqi Fan and Yanghao Li and Kaiming He},
  journal = {arXiv:2205.09113},
  title   = {Masked Autoencoders As Spatiotemporal Learners},
  year    = {2022},
}

Another implementation that supports AVA and SSv2 downstream evaluation is available in PySlowFast.

  • This repo is a modification of the MAE repo. Installation and preparation follow INSTALL.md.

  • This repo is based on timm==0.3.2, for which a fix is needed to work with PyTorch 1.8.1+.

Catalog

  • Visualization demo
  • Pre-trained checkpoints + fine-tuning code + testing code
  • Pre-training code

Visualization demo

Visualization of MAE output with 95% (left) and 98% (right) mask ratio on the same video.

Run our interactive visualization demo in the Colab notebook (no GPU needed).
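The demo reconstructs a video from a small random subset of visible spacetime patches. A minimal sketch of the random masking step, assuming patch embeddings of shape (N, L, D); the shapes and function name here are illustrative, not the repo's API:

```python
import torch

def random_spacetime_mask(patches, mask_ratio=0.95):
    """Keep a random (1 - mask_ratio) fraction of spacetime patches per clip."""
    N, L, D = patches.shape
    len_keep = int(L * (1 - mask_ratio))
    noise = torch.rand(N, L)                      # one uniform sample per patch
    ids_shuffle = torch.argsort(noise, dim=1)     # random permutation of patch indices
    ids_keep = ids_shuffle[:, :len_keep]          # indices of visible patches
    visible = torch.gather(
        patches, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D)
    )
    mask = torch.ones(N, L)                       # 1 = masked, 0 = visible
    mask.scatter_(1, ids_keep, 0)
    return visible, mask

# e.g. 8 x 14 x 14 = 1568 spacetime patches per clip
x = torch.randn(2, 1568, 768)
vis, mask = random_spacetime_mask(x, mask_ratio=0.95)
```

At a 95% mask ratio only 78 of the 1568 patches reach the encoder, which is what makes video MAE pre-training cheap relative to processing the full clip.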

Fine-tuning with pre-trained checkpoints

The following table lists the pre-trained checkpoints used in the paper, pre-trained with a 90% mask ratio for 1600 effective epochs and converted from the PySlowFast codebase:

                                          ViT-Large    ViT-Huge
pre-trained checkpoint on Kinetics-400    download     download
md5                                       edf3a5       3d7f64
pre-trained checkpoint on Kinetics-600    download     download
md5                                       9a9645       27495e
pre-trained checkpoint on Kinetics-700    download     download
md5                                       cdbada       4c4e3c

Fine-tuning instructions are in FINETUNE.md.
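A downloaded checkpoint can be verified against the md5 prefixes in the table above. A small helper for that check (the function name is ours; any md5 tool works equally well):

```python
import hashlib

def md5_prefix(path, n=6):
    """Return the first n hex characters of a file's md5 digest,
    for comparison against the prefixes listed in the checkpoint table."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so multi-GB checkpoints don't need to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()[:n]
```

For example, a Kinetics-400 ViT-Large checkpoint should yield the prefix edf3a5.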

Pre-training

Pre-training instructions are in PRETRAIN.md.
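Following the MAE recipe, pre-training minimizes the mean squared error between reconstructed and original pixels, computed on masked patches only. A minimal sketch of that loss, with illustrative names and shapes:

```python
import torch

def masked_mse_loss(pred, target, mask):
    """MSE over masked patches only.
    pred, target: (N, L, D) per-patch pixel values; mask: (N, L), 1 = masked."""
    loss = (pred - target) ** 2
    loss = loss.mean(dim=-1)                 # mean over pixels within each patch
    return (loss * mask).sum() / mask.sum()  # average over masked patches only

pred = torch.zeros(1, 4, 2)
target = torch.ones(1, 4, 2)
mask = torch.tensor([[1.0, 1.0, 0.0, 0.0]])  # first two patches are masked
loss = masked_mse_loss(pred, target, mask)
```

Visible patches are excluded from the loss, so the model is trained purely to predict content it never saw.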

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.
