# Enhancing Feature Fusion for Human Pose Estimation

A new method for fusing high-level and low-level features in human pose estimation.

## Introduction

This code is based on SimpleBaseline: https://github.com/microsoft/human-pose-estimation.pytorch. We use Semantic Embedding Block (SEB) and Global Convolutional Network (GCN) blocks to bridge the gap between low-level and high-level features. Experiments on the MPII and LSP human pose estimation datasets demonstrate that efficient feature fusion can significantly improve performance.
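For orientation, below is a minimal sketch of how SEB and GCN blocks are commonly defined (SEB following ExFuse, GCN following the "Large Kernel Matters" design). The channel counts, kernel size `k`, and the exact way this repository wires the blocks into the SimpleBaseline decoder are assumptions and may differ from the actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SEB(nn.Module):
    """Semantic Embedding Block (sketch): embed upsampled high-level semantics
    into low-level features via element-wise multiplication."""

    def __init__(self, high_channels, low_channels):
        super().__init__()
        self.conv = nn.Conv2d(high_channels, low_channels, kernel_size=3, padding=1)

    def forward(self, low, high):
        high = self.conv(high)
        high = F.interpolate(high, size=low.shape[2:], mode="bilinear", align_corners=False)
        return low * high  # low-level features modulated by high-level semantics


class GCN(nn.Module):
    """Global Convolutional Network block (sketch): two large separable kernels
    (k x 1 then 1 x k, and 1 x k then k x 1) approximating a dense k x k kernel."""

    def __init__(self, in_channels, out_channels, k=7):
        super().__init__()
        self.left = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, (k, 1), padding=(k // 2, 0)),
            nn.Conv2d(out_channels, out_channels, (1, k), padding=(0, k // 2)),
        )
        self.right = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, (1, k), padding=(0, k // 2)),
            nn.Conv2d(out_channels, out_channels, (k, 1), padding=(k // 2, 0)),
        )

    def forward(self, x):
        return self.left(x) + self.right(x)


if __name__ == "__main__":
    low = torch.randn(1, 256, 64, 64)   # hypothetical low-level feature map
    high = torch.randn(1, 2048, 8, 8)   # hypothetical high-level feature map
    fused = SEB(2048, 256)(low, high)   # (1, 256, 64, 64)
    out = GCN(256, 16, k=7)(fused)      # (1, 16, 64, 64), e.g. 16 MPII joints
    print(fused.shape, out.shape)
```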

## Results on MPII val

| Method | Input size | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SimpleBaseline_ResNet50 | 256x256 | 96.35 | 95.33 | 88.99 | 83.18 | 88.42 | 83.96 | 79.59 | 88.53 |
| Ours | 256x256 | 96.73 | 95.35 | 89.50 | 83.73 | 88.23 | 84.43 | 79.92 | 88.82 |
| SimpleBaseline_ResNet50 | 384x384 | 96.66 | 95.75 | 89.79 | 84.61 | 88.52 | 84.67 | 79.29 | 89.07 |
| Ours | 384x384 | 96.67 | 95.75 | 90.05 | 85.58 | 88.85 | 84.73 | 79.74 | 89.35 |

## Environment

- python >= 3.6
- pytorch >= 1.0.0

## Quick start

1. Download the dataset and pretrained models by following the official PyTorch implementation of SimpleBaseline; a typical directory layout is sketched after this list.
2. Train the model:
```bash
python pose_estimation/train.py \
    --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml
```
3. Validate the model:
```bash
python pose_estimation/valid.py \
    --cfg experiments/mpii/resnet50/256x256_d256x3_adam_lr1e-3.yaml \
    --flip-test \
    --model-file models/pytorch/pose_mpii/pose_resnet_50_256x256.pth.tar
```
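As an aid for step 1, a directory layout along the following lines is expected, inferred from the paths in the commands above and the SimpleBaseline conventions; exact folder and file names may differ for this fork.

```
${POSE_ROOT}
├── data
│   └── mpii                  # MPII images and annotations
├── experiments               # YAML configs (see --cfg above)
├── models
│   └── pytorch
│       ├── imagenet          # ImageNet-pretrained backbone weights
│       └── pose_mpii
│           └── pose_resnet_50_256x256.pth.tar
├── output
└── pose_estimation           # train.py / valid.py entry points
```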

## Future work

We plan to explore multi-scale feature fusion structures.