[Feature] Support Halpe26 keypoints. (#25)
* Update TODO

* TODO

* Support Halpe26

* Delete README.md

* Revert original README

* Update Readme and tiny other edits

* Update rtmlib/tools/solution/body_and_feet.py

Co-authored-by: Tau <[email protected]>

* Rename body+feet_demo.py to body_with_feet_demo.py

* Rename body_and_feet.py to body_with_feet.py

* Rename

* Fix typo

* Update README.md

Co-authored-by: Tau <[email protected]>

* Update README.md

Co-authored-by: Tau <[email protected]>

---------

Co-authored-by: davidpagnon <[email protected]>
Co-authored-by: Tau <[email protected]>
3 people committed Jul 3, 2024
1 parent da0d3c3 commit a9d8c5e
Showing 10 changed files with 270 additions and 10 deletions.
19 changes: 18 additions & 1 deletion README.md
@@ -90,13 +90,15 @@ python webui.py
- Solutions (High-level APIs)
  - [Wholebody](/rtmlib/tools/solution/wholebody.py)
  - [Body](/rtmlib/tools/solution/body.py)
  - [Body_with_feet](/rtmlib/tools/solution/body_with_feet.py)
  - [Hand](/rtmlib/tools/solution/hand.py)
  - [PoseTracker](/rtmlib/tools/solution/pose_tracker.py)
- Models (Low-level APIs)
  - [YOLOX](/rtmlib/tools/object_detection/yolox.py)
  - [RTMDet](/rtmlib/tools/object_detection/rtmdet.py)
  - [RTMPose](/rtmlib/tools/pose_estimation/rtmpose.py)
    - RTMPose for 17 keypoints
    - RTMPose for 26 keypoints
    - RTMW for 133 keypoints
    - DWPose for 133 keypoints
    - RTMO for one-stage pose estimation (17 keypoints)
@@ -180,10 +182,25 @@ Notes:

</details>

<details open>
<summary><b>Body 26 Keypoints</b></summary>

| ONNX Model | Input Size | AUC (Body8) | Description |
| :-------------------------------------------------------------------------------------------------------------------------------------------------: | :--------: | :-------: | :-------------------: |
| [RTMPose-t](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-t_simcc-body7_pt-body7-halpe26_700e-256x192-6020f8a6_20230605.zip) | 256x192 | 66.35 | trained on 7 datasets |
| [RTMPose-s](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-s_simcc-body7_pt-body7-halpe26_700e-256x192-7f134165_20230605.zip) | 256x192 | 68.62 | trained on 7 datasets |
| [RTMPose-m](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-m_simcc-body7_pt-body7-halpe26_700e-256x192-4d3e73dd_20230605.zip) | 256x192 | 71.91 | trained on 7 datasets |
| [RTMPose-l](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-l_simcc-body7_pt-body7-halpe26_700e-256x192-2abb7558_20230605.zip) | 256x192 | 73.19 | trained on 7 datasets |
| [RTMPose-m](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-m_simcc-body7_pt-body7-halpe26_700e-384x288-89e6428b_20230605.zip) | 384x288 | 73.56 | trained on 7 datasets |
| [RTMPose-l](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-l_simcc-body7_pt-body7-halpe26_700e-384x288-734182ce_20230605.zip) | 384x288 | 74.38 | trained on 7 datasets |
| [RTMPose-x](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-x_simcc-body7_pt-body7-halpe26_700e-384x288-7fb6e239_20230606.zip) | 384x288 | 74.82 | trained on 7 datasets |

</details>

<details open>
<summary><b>WholeBody 133 Keypoints</b></summary>

-| ONNX Model | Input Size | | Description |
+| ONNX Model | Input Size | AP (Whole) | Description |
| :------------------------------------------------------------------------------------------------------------------------------------------------: | :--------: | :--: | :-----------------------------: |
| [DWPose-t](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-t_simcc-ucoco_dw-ucoco_270e-256x192-dcf277bf_20230728.zip) | 256x192 | 48.5 | trained on COCO-Wholebody+UBody |
| [DWPose-s](https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-s_simcc-ucoco_dw-ucoco_270e-256x192-3fd922c8_20230728.zip) | 256x192 | 53.8 | trained on COCO-Wholebody+UBody |
50 changes: 50 additions & 0 deletions body_with_feet_demo.py
@@ -0,0 +1,50 @@
import time

import cv2

from rtmlib import BodyWithFeet, PoseTracker, draw_skeleton

device = 'cpu'
backend = 'onnxruntime'  # opencv, onnxruntime, openvino

cap = cv2.VideoCapture(0)  # 0 for webcam, or pass a video file path

openpose_skeleton = False  # True for openpose-style, False for mmpose-style

body_feet_tracker = PoseTracker(
    BodyWithFeet,
    det_frequency=7,
    to_openpose=openpose_skeleton,
    mode='performance',  # balanced, performance, lightweight
    backend=backend,
    device=device)

frame_idx = 0

while cap.isOpened():
    success, frame = cap.read()
    frame_idx += 1

    if not success:
        break

    s = time.time()
    keypoints, scores = body_feet_tracker(frame)
    det_time = time.time() - s
    print('det: ', det_time)

    img_show = frame.copy()

    img_show = draw_skeleton(img_show,
                             keypoints,
                             scores,
                             openpose_skeleton=openpose_skeleton,
                             kpt_thr=0.3,
                             line_width=3)

    img_show = cv2.resize(img_show, (960, 640))
    cv2.imshow('Body and Feet Pose Estimation', img_show)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press 'q' to exit
        break

cap.release()
cv2.destroyAllWindows()
4 changes: 2 additions & 2 deletions rtmlib/__init__.py
@@ -1,8 +1,8 @@
 from .tools import (RTMO, YOLOX, Body, Hand, PoseTracker, RTMDet, RTMPose,
-                    Wholebody)
+                    Wholebody, BodyWithFeet)
 from .visualization.draw import draw_bbox, draw_skeleton
 
 __all__ = [
     'RTMDet', 'RTMPose', 'YOLOX', 'Wholebody', 'Body', 'draw_skeleton',
-    'draw_bbox', 'PoseTracker', 'Hand', 'RTMO'
+    'draw_bbox', 'PoseTracker', 'Hand', 'RTMO', 'BodyWithFeet'
 ]
4 changes: 2 additions & 2 deletions rtmlib/tools/__init__.py
@@ -1,8 +1,8 @@
 from .object_detection import YOLOX, RTMDet
 from .pose_estimation import RTMO, RTMPose
-from .solution import Body, Hand, PoseTracker, Wholebody
+from .solution import Body, Hand, PoseTracker, Wholebody, BodyWithFeet
 
 __all__ = [
     'RTMDet', 'RTMPose', 'YOLOX', 'Wholebody', 'Body', 'Hand', 'PoseTracker',
-    'RTMO'
+    'RTMO', 'BodyWithFeet'
 ]
3 changes: 2 additions & 1 deletion rtmlib/tools/solution/__init__.py
@@ -2,5 +2,6 @@
 from .hand import Hand
 from .pose_tracker import PoseTracker
 from .wholebody import Wholebody
+from .body_with_feet import BodyWithFeet
 
-__all__ = ['Wholebody', 'Body', 'PoseTracker', 'Hand']
+__all__ = ['Wholebody', 'Body', 'PoseTracker', 'Hand', 'BodyWithFeet']
128 changes: 128 additions & 0 deletions rtmlib/tools/solution/body_with_feet.py
@@ -0,0 +1,128 @@
'''
Example:

    import cv2
    from rtmlib import BodyWithFeet, draw_skeleton

    device = 'cuda'
    backend = 'onnxruntime'  # opencv, onnxruntime

    cap = cv2.VideoCapture('./demo.mp4')

    to_openpose = True  # True for openpose-style, False for mmpose-style

    body_with_feet = BodyWithFeet(to_openpose=to_openpose,
                                  backend=backend,
                                  device=device)

    frame_idx = 0

    while cap.isOpened():
        success, frame = cap.read()
        frame_idx += 1

        if not success:
            break

        keypoints, scores = body_with_feet(frame)

        img_show = frame.copy()
        img_show = draw_skeleton(img_show,
                                 keypoints,
                                 scores,
                                 openpose_skeleton=to_openpose,
                                 kpt_thr=0.43)

        img_show = cv2.resize(img_show, (960, 540))
        cv2.imshow('img', img_show)
        cv2.waitKey(10)
'''

import numpy as np


class BodyWithFeet:
    """BodyWithFeet class for human pose estimation using the Halpe26
    keypoint format.

    This class supports different modes of operation and can output in
    OpenPose format.
    """

    MODE = {
        'performance': {
            'det':
            'https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/yolox_x_8xb8-300e_humanart-a39d44ed.zip',
            'det_input_size': (640, 640),
            'pose':
            'https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-x_simcc-body7_pt-body7-halpe26_700e-384x288-7fb6e239_20230606.zip',
            'pose_input_size': (288, 384),
        },
        'lightweight': {
            'det':
            'https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/yolox_tiny_8xb8-300e_humanart-6f3252f9.zip',
            'det_input_size': (416, 416),
            'pose':
            'https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-s_simcc-body7_pt-body7-halpe26_700e-256x192-7f134165_20230605.zip',
            'pose_input_size': (192, 256),
        },
        'balanced': {
            'det':
            'https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/yolox_m_8xb8-300e_humanart-c2c7a14a.zip',
            'det_input_size': (640, 640),
            'pose':
            'https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/onnx_sdk/rtmpose-m_simcc-body7_pt-body7-halpe26_700e-256x192-4d3e73dd_20230605.zip',
            'pose_input_size': (192, 256),
        }
    }

    def __init__(self,
                 det: str = None,
                 det_input_size: tuple = (640, 640),
                 pose: str = None,
                 pose_input_size: tuple = (192, 256),
                 mode: str = 'balanced',
                 to_openpose: bool = False,
                 backend: str = 'onnxruntime',
                 device: str = 'cpu'):
        """Initialize the BodyWithFeet (Halpe26) pose estimation model.

        Args:
            det (str, optional): Path to the detection model. If None, uses
                the default for the selected mode.
            det_input_size (tuple, optional): Input size for the detection
                model. Default is (640, 640).
            pose (str, optional): Path to the pose estimation model. If None,
                uses the default for the selected mode.
            pose_input_size (tuple, optional): Input size for the pose model.
                Default is (192, 256).
            mode (str, optional): Operation mode ('performance',
                'lightweight', or 'balanced'). Default is 'balanced'.
            to_openpose (bool, optional): Whether to convert output to
                OpenPose format. Default is False.
            backend (str, optional): Backend for inference ('onnxruntime' or
                'opencv'). Default is 'onnxruntime'.
            device (str, optional): Device for inference ('cpu' or 'cuda').
                Default is 'cpu'.
        """
        from .. import YOLOX, RTMPose

        if pose is None:
            pose = self.MODE[mode]['pose']
            pose_input_size = self.MODE[mode]['pose_input_size']

        if det is None:
            det = self.MODE[mode]['det']
            det_input_size = self.MODE[mode]['det_input_size']

        self.det_model = YOLOX(det,
                               model_input_size=det_input_size,
                               backend=backend,
                               device=device)
        self.pose_model = RTMPose(pose,
                                  model_input_size=pose_input_size,
                                  to_openpose=to_openpose,
                                  backend=backend,
                                  device=device)

    def __call__(self, image: np.ndarray):
        """Perform pose estimation on the input image.

        Args:
            image (np.ndarray): Input image for pose estimation.

        Returns:
            tuple: A tuple containing:
                - keypoints (np.ndarray): Estimated keypoint coordinates.
                - scores (np.ndarray): Confidence scores for each keypoint.
        """
        bboxes = self.det_model(image)
        keypoints, scores = self.pose_model(image, bboxes=bboxes)
        return keypoints, scores
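The fallback logic in `__init__` above always takes the model URL and its input size from the same `MODE` preset, so the two can never get out of sync. A standalone sketch of that pattern (the `MODE` entries and `resolve_pose` helper here are hypothetical placeholders, not the real rtmlib code or download URLs):

```python
# Hypothetical, abbreviated stand-in for the class-level MODE presets.
MODE = {
    'balanced': {'pose': 'rtmpose-m-halpe26.zip', 'pose_input_size': (192, 256)},
    'performance': {'pose': 'rtmpose-x-halpe26.zip', 'pose_input_size': (288, 384)},
}


def resolve_pose(pose=None, pose_input_size=(192, 256), mode='balanced'):
    # An explicit `pose` wins; otherwise both the model and its input size
    # come from the mode preset together, keeping them consistent.
    if pose is None:
        pose = MODE[mode]['pose']
        pose_input_size = MODE[mode]['pose_input_size']
    return pose, pose_input_size
```

Note that a caller-supplied `pose_input_size` is kept only when a custom `pose` is also given, mirroring the behavior of the `__init__` above.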
4 changes: 2 additions & 2 deletions rtmlib/visualization/__init__.py
@@ -1,7 +1,7 @@
 from .draw import draw_bbox, draw_skeleton
-from .skeleton import coco17, coco133, hand21, openpose18, openpose134
+from .skeleton import coco17, coco133, hand21, openpose18, openpose134, halpe26
 
 __all__ = [
     'draw_skeleton', 'draw_bbox', 'coco17', 'coco133', 'hand21', 'openpose18',
-    'openpose134'
+    'openpose134', 'halpe26'
 ]
6 changes: 5 additions & 1 deletion rtmlib/visualization/draw.py
@@ -27,6 +27,8 @@ def draw_skeleton(img,
             skeleton = 'openpose18'
         elif num_keypoints == 134:
             skeleton = 'openpose134'
+        elif num_keypoints == 26:
+            skeleton = 'halpe26'
         else:
             raise NotImplementedError
     else:
@@ -36,6 +38,8 @@ elif num_keypoints == 133:
             skeleton = 'coco133'
         elif num_keypoints == 21:
             skeleton = 'hand21'
+        elif num_keypoints == 26:
+            skeleton = 'halpe26'
         else:
             raise NotImplementedError
@@ -48,7 +52,7 @@
         scores = scores[None, :, :]
 
     num_instance = keypoints.shape[0]
-    if skeleton in ['coco17', 'coco133', 'hand21']:
+    if skeleton in ['coco17', 'coco133', 'hand21', 'halpe26']:
         for i in range(num_instance):
             img = draw_mmpose(img, keypoints[i], scores[i], keypoint_info,
                               skeleton_info, kpt_thr, radius, line_width)
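The change to `draw_skeleton` dispatches on the keypoint count: 26 keypoints now map to the `halpe26` skeleton in both the OpenPose-style and mmpose-style branches. A standalone sketch of that dispatch (`select_skeleton` is a hypothetical helper for illustration, not part of rtmlib):

```python
def select_skeleton(num_keypoints, openpose_skeleton=False):
    # Mirrors the keypoint-count dispatch in draw_skeleton after this change.
    if openpose_skeleton:
        mapping = {18: 'openpose18', 134: 'openpose134', 26: 'halpe26'}
    else:
        mapping = {17: 'coco17', 133: 'coco133', 21: 'hand21', 26: 'halpe26'}
    if num_keypoints not in mapping:
        raise NotImplementedError(
            f'unsupported keypoint count: {num_keypoints}')
    return mapping[num_keypoints]
```

Notice that `halpe26` is the only skeleton reachable from both branches: a 26-keypoint result is drawn with the Halpe26 layout regardless of the requested style.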
3 changes: 2 additions & 1 deletion rtmlib/visualization/skeleton/__init__.py
@@ -3,5 +3,6 @@
 from .hand21 import hand21
 from .openpose18 import openpose18
 from .openpose134 import openpose134
+from .halpe26 import halpe26
 
-__all__ = ['coco17', 'openpose18', 'coco133', 'openpose134', 'hand21']
+__all__ = ['coco17', 'openpose18', 'coco133', 'openpose134', 'hand21', 'halpe26']
59 changes: 59 additions & 0 deletions rtmlib/visualization/skeleton/halpe26.py
@@ -0,0 +1,59 @@
halpe26 = dict(
    name='halpe26',
    keypoint_info={
        0: dict(name='nose', id=0, color=[51, 153, 255], type='upper', swap=''),
        1: dict(name='left_eye', id=1, color=[51, 153, 255], type='upper', swap='right_eye'),
        2: dict(name='right_eye', id=2, color=[51, 153, 255], type='upper', swap='left_eye'),
        3: dict(name='left_ear', id=3, color=[51, 153, 255], type='upper', swap='right_ear'),
        4: dict(name='right_ear', id=4, color=[51, 153, 255], type='upper', swap='left_ear'),
        5: dict(name='left_shoulder', id=5, color=[0, 255, 0], type='upper', swap='right_shoulder'),
        6: dict(name='right_shoulder', id=6, color=[255, 128, 0], type='upper', swap='left_shoulder'),
        7: dict(name='left_elbow', id=7, color=[0, 255, 0], type='upper', swap='right_elbow'),
        8: dict(name='right_elbow', id=8, color=[255, 128, 0], type='upper', swap='left_elbow'),
        9: dict(name='left_wrist', id=9, color=[0, 255, 0], type='upper', swap='right_wrist'),
        10: dict(name='right_wrist', id=10, color=[255, 128, 0], type='upper', swap='left_wrist'),
        11: dict(name='left_hip', id=11, color=[0, 255, 0], type='lower', swap='right_hip'),
        12: dict(name='right_hip', id=12, color=[255, 128, 0], type='lower', swap='left_hip'),
        13: dict(name='left_knee', id=13, color=[0, 255, 0], type='lower', swap='right_knee'),
        14: dict(name='right_knee', id=14, color=[255, 128, 0], type='lower', swap='left_knee'),
        15: dict(name='left_ankle', id=15, color=[0, 255, 0], type='lower', swap='right_ankle'),
        16: dict(name='right_ankle', id=16, color=[255, 128, 0], type='lower', swap='left_ankle'),
        17: dict(name='head', id=17, color=[255, 128, 0], type='upper', swap=''),
        18: dict(name='neck', id=18, color=[255, 128, 0], type='upper', swap=''),
        19: dict(name='hip', id=19, color=[255, 128, 0], type='lower', swap=''),
        20: dict(name='left_big_toe', id=20, color=[255, 128, 0], type='lower', swap='right_big_toe'),
        21: dict(name='right_big_toe', id=21, color=[255, 128, 0], type='lower', swap='left_big_toe'),
        22: dict(name='left_small_toe', id=22, color=[255, 128, 0], type='lower', swap='right_small_toe'),
        23: dict(name='right_small_toe', id=23, color=[255, 128, 0], type='lower', swap='left_small_toe'),
        24: dict(name='left_heel', id=24, color=[255, 128, 0], type='lower', swap='right_heel'),
        25: dict(name='right_heel', id=25, color=[255, 128, 0], type='lower', swap='left_heel')
    },
    skeleton_info={
        0: dict(link=('left_ankle', 'left_knee'), id=0, color=[0, 255, 0]),
        1: dict(link=('left_knee', 'left_hip'), id=1, color=[0, 255, 0]),
        2: dict(link=('left_hip', 'hip'), id=2, color=[0, 255, 0]),
        3: dict(link=('right_ankle', 'right_knee'), id=3, color=[255, 128, 0]),
        4: dict(link=('right_knee', 'right_hip'), id=4, color=[255, 128, 0]),
        5: dict(link=('right_hip', 'hip'), id=5, color=[255, 128, 0]),
        6: dict(link=('head', 'neck'), id=6, color=[51, 153, 255]),
        7: dict(link=('neck', 'hip'), id=7, color=[51, 153, 255]),
        8: dict(link=('neck', 'left_shoulder'), id=8, color=[0, 255, 0]),
        9: dict(link=('left_shoulder', 'left_elbow'), id=9, color=[0, 255, 0]),
        10: dict(link=('left_elbow', 'left_wrist'), id=10, color=[0, 255, 0]),
        11: dict(link=('neck', 'right_shoulder'), id=11, color=[255, 128, 0]),
        12: dict(link=('right_shoulder', 'right_elbow'), id=12, color=[255, 128, 0]),
        13: dict(link=('right_elbow', 'right_wrist'), id=13, color=[255, 128, 0]),
        14: dict(link=('left_eye', 'right_eye'), id=14, color=[51, 153, 255]),
        15: dict(link=('nose', 'left_eye'), id=15, color=[51, 153, 255]),
        16: dict(link=('nose', 'right_eye'), id=16, color=[51, 153, 255]),
        17: dict(link=('left_eye', 'left_ear'), id=17, color=[51, 153, 255]),
        18: dict(link=('right_eye', 'right_ear'), id=18, color=[51, 153, 255]),
        19: dict(link=('left_ear', 'left_shoulder'), id=19, color=[51, 153, 255]),
        20: dict(link=('right_ear', 'right_shoulder'), id=20, color=[51, 153, 255]),
        21: dict(link=('left_ankle', 'left_big_toe'), id=21, color=[0, 255, 0]),
        22: dict(link=('left_ankle', 'left_small_toe'), id=22, color=[0, 255, 0]),
        23: dict(link=('left_ankle', 'left_heel'), id=23, color=[0, 255, 0]),
        24: dict(link=('right_ankle', 'right_big_toe'), id=24, color=[255, 128, 0]),
        25: dict(link=('right_ankle', 'right_small_toe'), id=25, color=[255, 128, 0]),
        26: dict(link=('right_ankle', 'right_heel'), id=26, color=[255, 128, 0]),
    })
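A skeleton definition in this format relies on an invariant: every non-empty `swap` entry must name an existing keypoint whose own `swap` points back (e.g. `left_big_toe` ↔ `right_big_toe`). A hypothetical sanity check, shown here on a small excerpt of the `halpe26` `keypoint_info` rather than the full dict:

```python
def check_swap_symmetry(keypoint_info):
    # Verify that left/right swap pairs are mutual and reference real keypoints.
    by_name = {info['name']: info for info in keypoint_info.values()}
    for info in keypoint_info.values():
        if info['swap']:
            partner = by_name[info['swap']]  # KeyError => dangling swap name
            assert partner['swap'] == info['name'], (info['name'],
                                                     partner['name'])


# Small excerpt of the halpe26 keypoint_info above.
sample = {
    17: dict(name='head', id=17, color=[255, 128, 0], type='upper', swap=''),
    20: dict(name='left_big_toe', id=20, color=[255, 128, 0], type='lower',
             swap='right_big_toe'),
    21: dict(name='right_big_toe', id=21, color=[255, 128, 0], type='lower',
             swap='left_big_toe'),
}
check_swap_symmetry(sample)
```

Keypoints without a mirror counterpart (`nose`, `head`, `neck`, `hip`) simply use `swap=''` and are skipped by the check.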
