Add MMPose wholebody estimator #231

Open
bricksdont wants to merge 9 commits into sign-language-processing:master from bricksdont:add-mmpose-wholebody

bricksdont (Collaborator) commented Jun 23, 2026

Summary

This PR adds support for running MMPose wholebody inference on a video and converting the result into the library's .pose format.

  • pose_format/utils/mmposewholebody.py — new module with estimate_mmpose_wholebody(input_path, fps, width, height). Runs MMPoseInferencer('wholebody') on a video file and returns a Pose object using the canonical COCO Wholebody 133 header (from cocowholebody133_header.py, introduced in Add COCO Wholebody 133 keypoint schema and generic.py support #226). Frames with no detected person are represented as fully-masked zero rows, preserving temporal alignment with the source video.
  • pose_format/utils/mmposewholebody_test.py — 8 unit tests covering header structure, output shape, masked empty frames, and the all-empty-video case. MMPose is stubbed via sys.modules so tests run without an actual MMPose installation.
  • pyproject.toml — new mmpose optional install target (pip install pose-format[mmpose]) with version floors derived from a known-good OpenMMLab 2.x combination.
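The empty-frame behaviour described above can be illustrated with a minimal NumPy sketch. This is not the code from the PR: the helper name `stack_frames`, the per-frame input layout (`None` or a `(133, 3)` array of x, y, score), and the single-person output shape `(frames, 1, 133, 2)` are assumptions made for illustration.

```python
import numpy as np
import numpy.ma as ma

NUM_KEYPOINTS = 133  # COCO Wholebody keypoint count


def stack_frames(frames):
    """Stack per-frame keypoints into a masked array, keeping one row per frame.

    `frames` is a list whose items are either None (no person detected) or an
    array-like of shape (133, 3) holding x, y, score. Hypothetical layout.
    """
    num_frames = len(frames)
    data = np.zeros((num_frames, 1, NUM_KEYPOINTS, 2), dtype=np.float32)
    confidence = np.zeros((num_frames, 1, NUM_KEYPOINTS), dtype=np.float32)
    mask = np.ones(data.shape, dtype=bool)  # start fully masked

    for index, frame in enumerate(frames):
        if frame is None:
            continue  # leave the row zeroed and masked; do not drop the frame
        keypoints = np.asarray(frame, dtype=np.float32)
        data[index, 0] = keypoints[:, :2]
        confidence[index, 0] = keypoints[:, 2]
        mask[index, 0] = False  # unmask frames that have a detection

    return ma.masked_array(data, mask=mask), confidence


# Three video frames, the middle one without a detection.
example_frames = [np.ones((NUM_KEYPOINTS, 3)), None, np.ones((NUM_KEYPOINTS, 3))]
example_data, example_confidence = stack_frames(example_frames)
```

Because the empty frame is kept as a masked row rather than skipped, frame index `i` in the output still corresponds to frame `i` of the video, and the mask lets downstream code tell "no detection" apart from a keypoint that really sits at the origin.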

Design notes

Not wired into the CLI: video_to_pose / videos_to_poses currently only support mediapipe. MMPose could be added to the CLI later; this PR intentionally keeps that scope separate.

Format vs. estimator — MMPose wholebody outputs COCO Wholebody 133 keypoints. A .pose file produced by this function is structurally identical to one from any other COCO Wholebody 133 estimator (OpenPifPaf, SDPose, etc.). detect_known_pose_format correctly identifies such files as "coco_wholebody_133". MMPose is therefore an estimator that produces an existing named format, not a new format in its own right.

mmcv GPU install caveat — plain pip install mmcv installs a CPU-only build; GPU users need the OpenMMLab CUDA-specific index. The ImportError message in the module links to the mmcv install guide.
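The optional-dependency pattern behind that ImportError can be sketched as follows. The helper `require_optional` is hypothetical and not part of pose_format; the hint text reuses the install target named in this PR, and the real module's message (which links to the mmcv install guide) may be worded differently.

```python
import importlib


def require_optional(module_name, install_hint):
    """Import an optional dependency or fail immediately with an actionable message.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise ImportError(
            f"{module_name} is required for this feature. {install_hint}"
        ) from error


# Usage: fail at import time rather than deep inside inference.
try:
    require_optional("mmpose", "Install with: pip install pose-format[mmpose]")
except ImportError as error:
    print(error)
```

Raising at import time keeps the failure close to its cause, and chaining with `from error` preserves the original traceback for debugging.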

Relationship to other PRs

This builds on #226 (COCO Wholebody 133 canonical header). Future PRs will add OpenPifPaf and SDPose estimators using the same header.

Test plan

  • pytest src/python/pose_format/utils/mmposewholebody_test.py -v passes without MMPose installed
  • pytest src/python/pose_format/utils/generic_test.py -v passes (no regressions)
  • With MMPose installed: estimate_mmpose_wholebody("video.mp4", fps=25, width=1920, height=1080) returns a Pose for which detect_known_pose_format returns "coco_wholebody_133"
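The first item relies on MMPose being stubbed via sys.modules. A minimal sketch of that technique, using only the standard library, might look like the following; the helper `make_mmpose_stub` and the shape of the fake results are assumptions, and the actual test file may build its stubs differently.

```python
import sys
import types
from unittest import mock


def make_mmpose_stub(results):
    """Build fake `mmpose` and `mmpose.apis` modules exposing MMPoseInferencer.

    Hypothetical helper: `results` is what the fake inferencer yields per frame.
    """
    fake_inferencer_class = mock.MagicMock(name="MMPoseInferencer")
    # Calling the class returns an instance; calling the instance returns an iterator.
    fake_inferencer_class.return_value.return_value = iter(results)

    mmpose_module = types.ModuleType("mmpose")
    apis_module = types.ModuleType("mmpose.apis")
    apis_module.MMPoseInferencer = fake_inferencer_class
    mmpose_module.apis = apis_module
    return {"mmpose": mmpose_module, "mmpose.apis": apis_module}


# One frame with no detected person (assumed result layout).
stubs = make_mmpose_stub([{"predictions": [[]]}])

# patch.dict restores sys.modules when the block exits.
with mock.patch.dict(sys.modules, stubs):
    from mmpose.apis import MMPoseInferencer

    inferencer = MMPoseInferencer("wholebody")
    frames = list(inferencer("video.mp4"))
```

Since the import system consults `sys.modules` before searching the filesystem, the `from mmpose.apis import ...` statement resolves to the stub even when MMPose is not installed, and no video file is ever opened.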

🤖 Generated with Claude Code

bricksdont and others added 9 commits June 23, 2026 10:45
Adds mmposewholebody.py wrapping MMPoseInferencer('wholebody') to load a video
into a Pose using the canonical COCO Wholebody 133 header from
cocowholebody133_header.py. MMPose is an optional dependency; importing the
module without it installed raises ImportError immediately (same pattern as
holistic.py / mediapipe).

Frames where no person is detected are not skipped — a zeroed, fully-masked row
is inserted instead so the output frame count stays aligned with the video.
Downstream code can distinguish "no detection" from a real keypoint via the mask.

Tests cover: header/components (no MMPose required), output shape and metadata,
empty-frame masking, all-empty video, and version default. MMPoseInferencer is
mocked via sys.modules stubs so the test suite runs without MMPose installed.

Co-Authored-By: catherine-o-brien <catherine-o-brien@users.noreply.github.com>
Co-Authored-By: GerrySant <GerrySant@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…imators

Restructures "Integration with External Data Sources" to label OpenPose support
as core and AlphaPose / MMPose as experimental. Adds MMPose wholebody loader
example with install instructions.

Co-Authored-By: catherine-o-brien <catherine-o-brien@users.noreply.github.com>
Co-Authored-By: GerrySant <GerrySant@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a [project.optional-dependencies] mmpose group listing mmcv, mmengine,
mmdet, and mmpose (all >=their first stable 1.x/2.x releases), installable via
pip install pose_format[mmpose].

Co-Authored-By: catherine-o-brien <catherine-o-brien@users.noreply.github.com>
Co-Authored-By: GerrySant <GerrySant@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Version floors tightened to match a known-good OpenMMLab 2.x combination
(mmcv 2.1.0 / mmengine 0.10.7 / mmdet 3.3.0 / mmpose 1.3.2) verified by
ZurichNLP's install script. Also clarifies that mmcv for GPU requires the
OpenMMLab CUDA-specific index, not plain pip install.

Co-Authored-By: catherine-o-brien <catherine-o-brien@users.noreply.github.com>
Co-Authored-By: GerrySant <GerrySant@users.noreply.github.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MMPose runs inference on raw video rather than loading a pre-existing
keypoint format, so it does not belong in this section. Mediapipe Holistic
is handled via the CLI (section 2) and also does not fit here.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
"load" implies ingesting a pre-existing keypoint file (like OpenPose/AlphaPose
loaders); this function runs inference on raw video, so "estimate" is more accurate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

bricksdont (Collaborator, Author) commented

The MMPose dependencies are rather heavy, so they are excluded from the tests and CI.

bricksdont requested a review from AmitMY June 23, 2026 09:36