🏃 MoveNet Pose Estimation

Estimate human poses using Google's MoveNet model. Supports single images and video files.

Upload Image

Input Image

Confidence Threshold (0 to 1)

Results

Annotated Output

Pose Data

Upload Video

Confidence Threshold (0 to 1)

Results

Processing Results

Issue #12: App development and pipeline integration

Endpoint alternative chosen: Gradio tab inside the existing app.py.

Input: one video file. Output: annotated cut 2D video, 3D skeleton animation video, keypoints CSV, and good/bad classification JSON.

Confidence Threshold (0 to 1)

A14: Advanced Exercise Pipeline

Features: Automated 'Ugly' recording rejection + 'Good/Bad' form classification.

Recording Quality Threshold (0.1 to 0.9)

Recording Status

A15: Exercise Scoring (0–4 regression)

Score scale: 0 = perfect form, 4 = worst kept clip.

Bands:

GREEN < 1: acceptable form
AMBER 1 to < 2: borderline, consider another take
RED ≥ 2: poor form
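The banding can be sketched as a small lookup. This is a minimal illustration, not the app's actual code: the function name `score_to_band` is hypothetical, and it assumes a score of exactly 2 falls in RED, as the RED ≥ 2 bound suggests.

```python
def score_to_band(score: float) -> str:
    """Map a 0-4 form score to a colour band (0 = perfect form)."""
    if not 0.0 <= score <= 4.0:
        raise ValueError(f"score must be in [0, 4], got {score}")
    if score < 1.0:
        return "GREEN"   # acceptable form
    if score < 2.0:
        return "AMBER"   # borderline, consider another take
    return "RED"         # poor form (assumption: exactly 2.0 counts as RED)


if __name__ == "__main__":
    for s in (0.4, 1.0, 1.99, 2.0, 3.7):
        print(s, score_to_band(s))
```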

A15 reuses the same upstream pipeline as A14 (pose extraction + 3D lift + A12 start/stop cut). The NN's decision time and a breakdown of the overall response time are reported alongside the score.
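One way to collect such a per-stage timing breakdown is sketched below. The helper names (`timed`, `run_pipeline`) and the stage names in the usage example are hypothetical placeholders, not taken from the app; the stages here are stand-in callables.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, timings):
    """Record the wall-clock duration of a block under timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


def run_pipeline(clip, stages):
    """Apply (name, callable) stages in order and time each one.

    Returns the final value and a dict of seconds per stage plus a "total".
    """
    timings = {}
    value = clip
    for name, fn in stages:
        with timed(name, timings):
            value = fn(value)
    timings["total"] = sum(timings.values())
    return value, timings


if __name__ == "__main__":
    # Stand-in stages; a real chain would call the pose, lift, cut, and NN steps.
    demo_stages = [
        ("pose_extraction", lambda x: x),
        ("nn_decision", lambda x: 1.3),
    ]
    score, timings = run_pipeline("clip.mp4", demo_stages)
    print(score, timings)
```

Keeping the NN stage as its own entry makes its decision time readable directly from the dict, separate from the total.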

Recording Quality Threshold (0.1 to 0.9)

Band

Score (0–4)

A16: Final unified endpoint (3D alternative)

Record a clip with your webcam (or upload one), then click Run A16 endpoint. The result appears on the right: a video with the skeleton overlaid on your recording, plus the output of the full Part-II chain: pose extraction → PoseNet-to-Kinect 2D conversion → 2D-to-3D lift → start/stop cut → ugly and good/bad classification → 0–4 score.

Processing is currently CPU-only and runs the full chain end-to-end, so a 5-10 s clip can take roughly 20-60 s on the HF Space.

Recording quality threshold (0.1 to 0.9; detection-rate based: 0.6 = pose visible in 60% of the first 30 frames)
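The detection-rate check described by this threshold could look roughly like the sketch below. It rests on several assumptions that are not confirmed by the app: a frame counts as "pose visible" when its pose score reaches a hypothetical `score_floor`, the comparison against the threshold is inclusive, and the function and status names are illustrative only.

```python
def detection_rate(frame_scores, score_floor=0.3, window=30):
    """Fraction of the first `window` frames in which a pose is visible.

    frame_scores: one pose confidence per frame (assumed representation).
    score_floor: hypothetical per-frame cutoff for "pose visible".
    """
    head = frame_scores[:window]
    if not head:
        return 0.0
    visible = sum(1 for s in head if s >= score_floor)
    return visible / len(head)


def recording_status(frame_scores, quality_threshold=0.6):
    """Return ("OK" or "UGLY", rate); the recording is rejected below the threshold."""
    rate = detection_rate(frame_scores)
    status = "OK" if rate >= quality_threshold else "UGLY"
    return status, rate


if __name__ == "__main__":
    scores = [0.9] * 24 + [0.1] * 6 + [0.0] * 100  # frames after the 30th are ignored
    print(recording_status(scores))
```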

Status

MediaPipe 3D Pose Livestream

Live webcam pose estimation using MediaPipe Tasks (pose_landmarker_lite.task). The left panel shows the 2D skeleton overlay; the right panel shows the 3D world landmarks.

Webcam (input)

2D pose overlay

3D world landmarks

Features

Single Image Processing: Upload and process static images
Video Processing: Upload video files for pose estimation
17 COCO Keypoints: Detects nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles
Confidence Threshold: Adjust detection sensitivity
CSV/JSON Export: Download pose data for further analysis
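As an illustration of how the keypoints, the confidence threshold, and the CSV export fit together, here is a minimal sketch. The keypoint order and the (y, x, score) triple layout follow MoveNet's documented output; the function names and the CSV column layout are assumptions and may differ from what the app actually writes.

```python
import csv
import io

# The 17 COCO keypoints in MoveNet's output order.
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def filter_keypoints(keypoints, threshold):
    """Keep keypoints whose score is at least `threshold`.

    keypoints: 17 (y, x, score) triples with normalized coordinates.
    """
    if len(keypoints) != len(KEYPOINT_NAMES):
        raise ValueError("expected 17 keypoints")
    return [
        {"name": name, "x": x, "y": y, "score": score}
        for name, (y, x, score) in zip(KEYPOINT_NAMES, keypoints)
        if score >= threshold
    ]


def to_csv(rows):
    """Serialize filtered keypoints to CSV text (hypothetical column layout)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "x", "y", "score"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    demo = [(0.2, 0.5, 0.9)] + [(0.5, 0.5, 0.1)] * 16
    print(to_csv(filter_keypoints(demo, threshold=0.3)))
```

Raising the threshold drops low-confidence joints from the export; lowering it keeps more joints at the cost of noisier positions.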

Model Details

Model: MoveNet SinglePose (Lightning)
Input size: 192x192 pixels
Fast and efficient real-time pose estimation
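Because the model takes a fixed 192x192 input, frames of other sizes must be resized first. The sketch below computes the geometry for an aspect-preserving resize with padding and maps a normalized keypoint back to original pixel coordinates. Whether this app pads or simply stretches the frame is not stated, so treat the padding approach and both function names as assumptions.

```python
def letterbox_geometry(height, width, target=192):
    """Return (new_h, new_w, pad_top, pad_left) for fitting a frame into a
    target x target square while preserving its aspect ratio."""
    longest = max(height, width)
    new_h = max(1, round(height * target / longest))
    new_w = max(1, round(width * target / longest))
    pad_top = (target - new_h) // 2
    pad_left = (target - new_w) // 2
    return new_h, new_w, pad_top, pad_left


def to_original_pixels(y_norm, x_norm, height, width, target=192):
    """Map a normalized (y, x) in the padded square back to source pixels."""
    _, _, pad_top, pad_left = letterbox_geometry(height, width, target)
    scale = target / max(height, width)
    y = (y_norm * target - pad_top) / scale
    x = (x_norm * target - pad_left) / scale
    return y, x


if __name__ == "__main__":
    print(letterbox_geometry(480, 640))
    print(to_original_pixels(0.5, 0.5, 480, 640))
```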
