
Problem: A kids' phonics app needs expressive face and body gestures for 10 animal characters with 9 states each. There is no animator on staff, and hand-keyframing is months of work that doesn't scale to "add a new animal next month."

What I built — three layers:

Data-driven gestures — extract pose features from my own gesture videos (RTMLib, 133 keypoints/frame), aggregate to a ~30-parameter schema per gesture. The body becomes the data source.
Config-driven pipeline — per-animal manifest declares the contract; wiring step emits ~1,200 keyframe calls deterministically; Kotlin reads the same state-machine names. No hardcoded gesture names anywhere.
Validators as testable contracts — 4-dimension score (Spec / Silhouette / Distinctiveness / Safety), 13 geometry-and-motion validators, append-only score log. The objective function is in place; parameter sweeps are next.

Result: Shipping in production in L2R V8 (Play Store). 6 animals × 9 gestures, all auto-derived. Adding each remaining animal, up to the planned 10, is a manifest change.

Stack: Python (RTMLib pose extraction, NumPy, savgol smoothing, pytest validators), Rive runtime (Kotlin Android), JSON manifests for the contract, Rive scripting via MCP for keyframe emission.
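To make the "aggregate to a schema" step concrete, here is a minimal sketch of reducing per-frame keypoints to a few per-gesture parameters. It is an illustration under stated assumptions, not the pipeline's actual code: the keypoint names (`l_shoulder`, `r_shoulder`, `r_wrist`), the parameter names, and the shoulder-width normalization are all hypothetical, and a centered moving average stands in for the Savitzky-Golay smoothing listed in the stack so the example needs only the standard library.

```python
from statistics import mean


def moving_average(values, window=5):
    """Centered moving average; a simple stand-in for Savitzky-Golay smoothing."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(mean(values[lo:hi]))
    return out


def aggregate_gesture(frames, fps=30.0):
    """Reduce per-frame keypoints to a few per-gesture parameters.

    frames: list of dicts mapping a (hypothetical) keypoint name to (x, y)
            in pixels, with y growing downward as in image coordinates.
    """
    # Normalize by shoulder width so values do not depend on camera distance.
    shoulder_w = mean(abs(f["r_shoulder"][0] - f["l_shoulder"][0]) for f in frames)
    # Wrist height above the shoulder, in shoulder widths (positive = raised).
    raise_series = [(f["r_shoulder"][1] - f["r_wrist"][1]) / shoulder_w for f in frames]
    smooth = moving_average(raise_series)
    peak_idx = max(range(len(smooth)), key=smooth.__getitem__)
    return {
        "arm_raise_peak": round(smooth[peak_idx], 3),
        "arm_raise_peak_time_s": round(peak_idx / fps, 3),
        "arm_raise_range": round(max(smooth) - min(smooth), 3),
    }
```

A real schema would presumably repeat this pattern for each measured feature (head tilt, torso lean, timing) until it reaches the ~30 parameters per gesture.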

Learn to Read · Rive Animation Pipeline

Teaching animals to gesture

My one-minute version: I started by manually tuning a single animal in Rive. Then I automated it with Python. Then I realized the most credible source for animation magnitudes is my own body — record the gesture, extract the features, let the schema drive everything downstream. Three deep dives below.

6 animals shipped (10 planned)

9 gesture states each

gesture videos (my own body)

~30 params per gesture (from 100 frames)

code changes per new animal

The interesting ML problem isn't "extract poses from video" — that's a solved library call. The interesting problem is making the extracted poses actually drive a believable animation on a stylized animal that has nothing to do with a human skeleton, and then knowing whether what you shipped is any good. That's three problems: data, architecture, evaluation. One per card.

▶ See it running

Watch the wired animal play in the Learn to Read app — kid taps a sound, animal celebrates. One command rebuilt this from my video.


From hardcoded to data-driven

How my own gesture videos became ~30 measured parameters per gesture, plus the rest-frame and pixel→Rive problems I had to solve.

Read the deep-dive →
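The card above names two problems without detailing them, so the following is only one plausible reading, sketched under assumptions: "rest-frame" is taken to mean measuring motion relative to a neutral pose, and "pixel→Rive" is taken to mean rescaling pixel displacements into artboard units using a shared reference length. The function names, the median-of-first-frames rest estimate, and the reference lengths are all hypothetical.

```python
from statistics import median


def rest_value(series_px, rest_frames=10):
    """Estimate the neutral pose as the median of the first few frames (assumption)."""
    return median(series_px[:rest_frames])


def pixels_to_artboard(delta_px, human_ref_px, character_ref_units):
    """Rescale a pixel displacement into artboard units.

    human_ref_px: a body length measured in the video, e.g. shoulder width.
    character_ref_units: the matching length on the character's artboard.
    """
    if human_ref_px <= 0:
        raise ValueError("human_ref_px must be positive")
    return delta_px * (character_ref_units / human_ref_px)


def to_artboard_offsets(series_px, human_ref_px, character_ref_units, rest_frames=10):
    """Convert a per-frame pixel track into offsets from rest, in artboard units."""
    rest = rest_value(series_px, rest_frames)
    return [
        pixels_to_artboard(v - rest, human_ref_px, character_ref_units)
        for v in series_px
    ]
```

Expressing motion as an offset from rest means the animation adds to the character's own neutral pose instead of copying absolute human coordinates, which matters when the character's proportions differ from a human skeleton.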

Diagram: manifest (per animal) → 10-prompt wiring → .riv → Kotlin (same state-machine names)

Config-driven pipeline

From manifest upstream through wiring to Kotlin downstream — no gesture name is hardcoded. The contract crosses Python ↔ Kotlin as a stable schema.
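A minimal sketch of what "no gesture name is hardcoded" can look like in practice: every name the wiring step touches is read from the manifest, and iteration is sorted so the same manifest always yields the same call list. The manifest fields (`animal`, `bones`, `gestures`) and the call record shape are hypothetical, chosen only for illustration; the real manifest schema is not shown in this page.

```python
import json

# Hypothetical manifest; field names are assumptions for this sketch.
MANIFEST_JSON = """
{
  "animal": "chick",
  "bones": ["head", "wing_l", "wing_r"],
  "gestures": {
    "celebrate": {"head": {"rotation": 12.0}, "wing_l": {"rotation": -40.0}},
    "idle": {"head": {"rotation": 2.0}}
  }
}
"""


def emit_keyframe_calls(manifest):
    """Expand a manifest into an ordered list of keyframe call records."""
    known_bones = set(manifest["bones"])
    calls = []
    # Sort at every level so output order never depends on dict ordering.
    for gesture in sorted(manifest["gestures"]):
        targets = manifest["gestures"][gesture]
        for bone in sorted(targets):
            if bone not in known_bones:
                raise KeyError(f"{gesture}: unknown bone {bone!r}")
            for prop, value in sorted(targets[bone].items()):
                calls.append(
                    {"animation": gesture, "target": bone, "property": prop, "value": value}
                )
    return calls


def state_names(manifest):
    """Names the app side would look up; derived from the manifest, never typed by hand."""
    return sorted(manifest["gestures"])


manifest = json.loads(MANIFEST_JSON)
calls = emit_keyframe_calls(manifest)
```

Because both the call list and the state names come from the same file, renaming a gesture in the manifest changes both sides together, which is the point of treating the manifest as the contract.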

Validators as testable contracts

A 4-dimension gesture score and 13 validators turn animation rules into tests. The objective function is in place; the sweep is next.

Read the deep-dive →
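As a rough illustration of "animation rules as tests", the sketch below shows one geometry check, a four-field score, and an append-only log. Only the four dimension names come from this page; the rotation limit, the unweighted mean used for `total`, the record layout, and the JSON Lines file format are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GestureScore:
    spec: float
    silhouette: float
    distinctiveness: float
    safety: float

    def total(self):
        # Unweighted mean; how the real pipeline combines dimensions is not stated.
        return (self.spec + self.silhouette + self.distinctiveness + self.safety) / 4


def validate_rotation_limits(keyframes, limit_deg=60.0):
    """Return violation messages; an empty list means the check passed."""
    return [
        f"{k['target']}: {k['value']} deg exceeds {limit_deg}"
        for k in keyframes
        if k["property"] == "rotation" and abs(k["value"]) > limit_deg
    ]


def append_score(log_path, animal, gesture, score):
    """Append one record to the log; existing lines are never rewritten."""
    record = {"animal": animal, "gesture": gesture, **asdict(score), "total": score.total()}
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```

A validator written this way drops straight into pytest as `assert validate_rotation_limits(keyframes) == []`, and the log gives a later parameter sweep a history of totals to compare against.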

Outcome: The Rive pipeline ships in the L2R V8 Android release (Play Store gate passed). Six animals × nine gesture states each, all auto-derived from the schema. Adding a new gesture is a recording. Adding a new animal is a manifest. The pipeline is reproducible: python3 scripts/run_external_preview_batch.py rebuilds every animal from source.

Other drill-downs

📷 Frame browser: per-frame pose extraction
🐥 Chick previewer: live wired .riv
📊 Knowledge graph (1.3 MB, opens new tab)