Kettlebell Form has moved from a simple form-checking idea into a browser-based swing coach. The current repo is a Vite, React, and TypeScript app that runs MediaPipe Pose Landmarker against a webcam stream and keeps the analysis on device.
The app keeps the coaching loop local: camera in, pose inference in the browser, feedback immediately on the same screen.
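That loop can be sketched with the `@mediapipe/tasks-vision` web API. This is a browser-only setup fragment, not the app's actual code: the wasm CDN path, model file location, and loop shape are assumptions.

```typescript
import { FilesetResolver, PoseLandmarker } from "@mediapipe/tasks-vision";

// Browser-only sketch: camera in, pose inference on device, landmarks out.
async function startCoach(video: HTMLVideoElement) {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm", // assumed path
  );
  const landmarker = await PoseLandmarker.createFromOptions(vision, {
    baseOptions: { modelAssetPath: "/models/pose_landmarker_lite.task" }, // assumed
    runningMode: "VIDEO",
    numPoses: 1,
  });
  video.srcObject = await navigator.mediaDevices.getUserMedia({ video: true });
  await video.play();
  const tick = () => {
    const result = landmarker.detectForVideo(video, performance.now());
    if (result.landmarks.length > 0) {
      // result.landmarks[0] is one pose: 33 normalized {x, y, z, visibility}
      // points that feed the swing analysis and the overlays.
    }
    requestAnimationFrame(tick);
  };
  requestAnimationFrame(tick);
}
```

Nothing leaves the page: the model runs in the browser and the landmark arrays go straight into the analysis on the same frame.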
The target movement is deliberately narrow: a two-hand hardstyle/Russian kettlebell swing. That choice matters because "good form" is not a single universal shape. The app is trying to read a hip-dominant hinge pattern, neutral torso and head stack, relaxed arms, and a bell path driven by hip extension.
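Reading a hinge from landmarks comes down to joint angles. A minimal sketch, assuming MediaPipe's normalized `{x, y}` points (origin top-left, y increasing downward); the function itself is generic vector math, not the app's implementation:

```typescript
type Pt = { x: number; y: number };

// Interior angle at joint b, formed by segments b->a and b->c, in degrees.
function jointAngle(a: Pt, b: Pt, c: Pt): number {
  const v1 = { x: a.x - b.x, y: a.y - b.y };
  const v2 = { x: c.x - b.x, y: c.y - b.y };
  const dot = v1.x * v2.x + v1.y * v2.y;
  const mag = Math.hypot(v1.x, v1.y) * Math.hypot(v2.x, v2.y);
  // Clamp before acos to guard against floating-point drift past [-1, 1].
  return (Math.acos(Math.min(1, Math.max(-1, dot / mag))) * 180) / Math.PI;
}
```

The hip angle is shoulder-hip-knee and the knee angle is hip-knee-ankle; a hinge shows a large hip-angle change with a comparatively small knee-angle change.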
Before scoring, the coach asks for a standing calibration sample. That gives it a personal baseline for upright hip angle, knee angle, torso lean, body proportions, visibility, and jitter. From there it can evaluate swing phase, rep count, hinge-to-knee ratio, lockout, shoulder lift, spine stack, depth travel, camera quality, and confidence.
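A calibration pass of this kind reduces to per-signal statistics. A hedged sketch (the names and the choice of standard deviation as the jitter measure are assumptions): sample a signal such as the hip angle over a few seconds of standing still, then keep the mean as the upright reference and the spread as jitter.

```typescript
type Baseline = { mean: number; jitter: number };

// Collapse standing-still samples of one signal (e.g. hip angle per frame)
// into an upright reference (mean) and a noise floor (standard deviation).
function calibrate(samples: number[]): Baseline {
  const mean = samples.reduce((s, v) => s + v, 0) / samples.length;
  const variance =
    samples.reduce((s, v) => s + (v - mean) ** 2, 0) / samples.length;
  return { mean, jitter: Math.sqrt(variance) };
}
```

Scoring against a personal baseline, rather than fixed angle thresholds, is what lets the same cues work across different body proportions and camera placements.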
The useful part is not counting movement. It is separating a hip-dominant swing from a squat, arm pull, shallow backswing, or weak lockout.
The feedback model is intentionally practical. It flags a missing side-view or poor framing, squat-dominant backswing, shallow depth, incomplete hip extension, arms lifting the bell, and spine or neck stack issues. Rather than detecting the bell itself, the app estimates its position from the midpoint of the two wrists, which keeps inference light enough for a live browser session.
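The wrist-midpoint proxy is a one-liner, plus a little smoothing to tame landmark jitter. The midpoint is what the text describes; the exponential smoothing and its factor are assumptions:

```typescript
type P = { x: number; y: number };

// Bell proxy: midpoint of the two wrist landmarks (no object detector needed).
function bellEstimate(leftWrist: P, rightWrist: P): P {
  return {
    x: (leftWrist.x + rightWrist.x) / 2,
    y: (leftWrist.y + rightWrist.y) / 2,
  };
}

// Exponential smoothing over frames; alpha is an assumed tuning value.
function smooth(prev: P | null, next: P, alpha = 0.5): P {
  if (!prev) return next;
  return {
    x: prev.x + alpha * (next.x - prev.x),
    y: prev.y + alpha * (next.y - prev.y),
  };
}
```

Tracking this point over a rep gives the bell path and the depth-travel signal without any extra model.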
The latest visual layer is the biggest shift from the original concept. The mirrored camera feed is projected into body, muscle, skeleton, and Gaussian correction overlays, while a Three.js scene shows a simplified 3D anatomy rig driven by the same pose landmarks.
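Driving a 3D rig from the same landmarks needs one coordinate remap. A sketch under stated assumptions: MediaPipe landmarks are normalized with the origin top-left, x right, y down; a Three.js scene wants a centered, Y-up space; the mirror flip matches the mirrored camera feed; the scale and z handling are assumptions.

```typescript
type L = { x: number; y: number; z: number };

// Map one normalized landmark into a centered, Y-up rig space.
function toRigSpace(lm: L, scale = 2): { x: number; y: number; z: number } {
  return {
    x: (0.5 - lm.x) * scale, // mirror horizontally and center
    y: (0.5 - lm.y) * scale, // flip y-down to y-up and center
    z: -lm.z * scale, // MediaPipe z is roughly depth relative to the hips
  };
}
```

The resulting points can be assigned directly to the rig's joint positions each frame, so the overlays and the 3D anatomy stay in lockstep with the same inference output.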
The anatomy layers make the model legible: they show what the coach thinks it is seeing, not just what the camera sees.
Those layers are not pretending to be clinical motion capture. They are a product design choice: the user needs to understand why the app is giving a cue, not just see a red or green score. Showing uncertainty, joint risk, posterior-chain demand, shoulder-lift risk, and depth travel makes the system easier to interrogate.
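One concrete way to surface that uncertainty, sketched here as an assumption about the approach rather than the app's actual formula: MediaPipe landmarks carry a per-point visibility in [0, 1], and the weakest key joint should bound how much any cue is trusted.

```typescript
// Frame confidence from per-landmark visibility scores (formula and the 1.5
// weight are illustrative assumptions). A single occluded key joint caps the
// score, so cues fade instead of flickering between red and green.
function frameConfidence(visibilities: number[]): number {
  const min = Math.min(...visibilities);
  const mean = visibilities.reduce((s, v) => s + v, 0) / visibilities.length;
  return Math.min(min * 1.5, mean);
}
```

Rendering this value into the overlays, instead of hiding it, is what makes the system interrogable: a low-confidence cue visibly looks like a guess.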
The current limits are still explicit. Monocular pose landmarks are not calibrated multi-camera biomechanics, dense depth is not fused yet, and pain or rehab decisions still belong with a qualified human coach or clinician. The next serious engineering step would be an optional dense-depth worker, likely using Depth Anything V2 or a WebGPU depth model, fused with pose landmarks and camera calibration.
