
A Novel Kinematic Approach to 3D Human Pose Estimation


One pose fits all

Open in Colab

This repository contains the files used in the thesis by Yen-Lin Wu, completed in partial fulfillment of his MSc programme in Mechanical Engineering at Delft University of Technology (2021) and supervised by Osama Mazhar and Jens Kober. The thesis is available here.

The thesis addresses the widely studied computer vision task of 3D human pose estimation. Unlike most existing methods, we propose an estimation technique that discards convolutional layers and uses only Transformer layers. On top of that, we integrate a human kinematic model that encapsulates bone length and joint angle constraints to improve prediction accuracy. We also propose a new evaluation metric, Mean Per Bone Vector Error (MPBVE), which focuses on human posture and is independent of body shape, age, or gender.
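As a rough illustration, a per-bone vector error of this kind can be computed from bone directions rather than joint positions; the sketch below assumes each bone is reduced to a unit direction vector so that body size cancels out. The exact MPBVE definition is given in the thesis, and the names here are purely illustrative.

# Rough sketch of a per-bone vector error in the spirit of MPBVE,
# assuming each bone is reduced to a unit direction vector so that
# body size cancels out; see the thesis for the exact definition.
import numpy as np

def per_bone_vector_error(pred_joints, gt_joints, bone_pairs):
    """pred_joints, gt_joints: (J, 3) arrays; bone_pairs: list of (parent, child) indices."""
    errors = []
    for parent, child in bone_pairs:
        v_pred = pred_joints[child] - pred_joints[parent]
        v_gt = gt_joints[child] - gt_joints[parent]
        v_pred = v_pred / np.linalg.norm(v_pred)  # unit vectors discard bone length
        v_gt = v_gt / np.linalg.norm(v_gt)
        errors.append(np.linalg.norm(v_pred - v_gt))
    return float(np.mean(errors))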

PEBRT (Pose Estimation by Bone Rotation using Transformer)

PEBRT estimates rotation matrix parameters for each bone, which are then applied to a human kinematic model. Each rotation matrix is recovered via the Gram-Schmidt orthogonalization proposed by Zhou et al.
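For intuition, the sketch below shows the standard 6D-to-rotation mapping from Zhou et al. recovered via Gram-Schmidt; variable names are illustrative and the repository's own implementation (see DOCUMENTATIONS.md) may differ in detail.

# Sketch of Gram-Schmidt recovery of a rotation matrix from a 6D
# representation, following Zhou et al.; names are illustrative only.
import numpy as np

def rot6d_to_matrix(x6):
    """x6: 6 values predicted by the network for one bone."""
    a1, a2 = np.asarray(x6[:3], dtype=float), np.asarray(x6[3:], dtype=float)
    b1 = a1 / np.linalg.norm(a1)            # first orthonormal basis vector
    a2 = a2 - np.dot(b1, a2) * b1           # remove the component along b1
    b2 = a2 / np.linalg.norm(a2)            # second orthonormal basis vector
    b3 = np.cross(b1, b2)                   # cross product completes the basis
    return np.stack([b1, b2, b3], axis=-1)  # columns form a valid rotation matrix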

For implementation details, please refer to DOCUMENTATIONS.md.

Quick start

Clone the repository and install required dependencies to proceed.

git clone https://github.com/wuyenlin/pebrt
cd pebrt/
pip3 install -r requirements.txt

Dataset setup

These instructions only cover the setup of Human3.6M; for full dataset setup, please refer to DATASETS.md. While MPI-INF-3DHP can be used for training, evaluation on its test set is not implemented here.

Evaluating our pre-trained models

Download the pre-trained models from Google Drive. For example, to run evaluation with 4 Transformer encoder layers:

python3 lift.py --num_layers 4 --eval --checkpoint /path/to/all_4_lay_epoch_latest.bin

Training from scratch

To start training the model with 1 Transformer encoder layer, run

python3 lift.py --num_layers 1

If you are running on an SLI-enabled machine or a computing cluster, use the following PyTorch DDP launch command (example using 2 GPUs):

python3 -m torch.distributed.launch --nproc_per_node=2 --nnodes=1 lift.py
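For reference, torch.distributed.launch spawns one process per GPU and hands each process its local rank. The snippet below is the standard PyTorch DDP boilerplate such a launch expects, not necessarily lift.py's exact code.

# Standard PyTorch DDP setup that torch.distributed.launch expects;
# lift.py's actual implementation may differ.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_ddp(model):
    # Depending on the PyTorch version, the local rank arrives either as a
    # --local_rank argument or as the LOCAL_RANK environment variable.
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")   # NCCL backend for multi-GPU training
    model = model.cuda(local_rank)
    return DDP(model, device_ids=[local_rank])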

The following arguments are available (see the parser sketch after this list):

  • --bs : batch size. Default: 2.
  • --num_layers : number of Transformer encoder layers. Default: 2.
  • --resume : accepts the path to trained weights when you want to resume training.
  • --checkpoint : accepts the path to trained weights when you want to evaluate or visualize the results.
  • --eval : activate evaluation mode, to be used together with --checkpoint.
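
Below is a minimal argparse sketch consistent with the flags listed above; the actual parser in lift.py may define additional options.

# Minimal argparse sketch matching the flags listed above; the real
# parser in lift.py may define more options.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--bs", type=int, default=2, help="batch size")
parser.add_argument("--num_layers", type=int, default=2, help="number of Transformer encoder layers")
parser.add_argument("--resume", type=str, default=None, help="path to trained weights to resume training from")
parser.add_argument("--checkpoint", type=str, default=None, help="path to trained weights for evaluation or visualization")
parser.add_argument("--eval", action="store_true", help="run evaluation (use with --checkpoint)")
args = parser.parse_args()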

Animate results

With (pre-)trained weights, you can visualize and animate the results on our human model. Run the following command, where X is the number of Transformer encoder layers:

python3 animation.py --checkpoint /path/to/weights/ --num_layers X --bs 64

The output looks something like the example below.

TODO

  • Animate results (see animation.py)
  • Create evaluation metrics for bone rotation error
  • Add kinematic constraints
  • Train and test on Human3.6M
  • Run on distributed systems (for SLI)
  • Add human model for both Human3.6M and MPI-INF-3DHP datasets
  • Fix camera angle issue / add 3D joint position in loss
  • Test evaluation metrics on existing methods
  • Add instructions for animation
  • Complete documentation
  • Fix animation order issue
  • Finish dataset setup instruction
  • Separate human model configurations into yaml files