5 verified gesture videos (ask_child, celebrating, encouraging, look_at_letter, thinking) — me doing the gesture in front of a webcam. Plus BABEL/AMASS public MoCap data for seed gestures (idle, walk_in, waiting, helping).
Per-frame 2D landmarks → CharacterScaler per-animal → schema (magnitudes per body part) + curves (temporal shapes) + auto specs (engine-ready tracks). Outputs a wired .riv file.
Kid taps a sound in the L2R Android app. The wired .riv plays the gesture (animal celebrates, encourages, asks). Ships in production V8 release.
Four-step pipeline. Steps 0 and 1 are the ML core; steps 2 and 3 are the wiring + verification.
shutil.copy2(original_riv, output_dir / "starter_copy.riv"). This is the kind of safety guarantee you only learn to want after losing two days of artwork to a buggy script.
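A minimal sketch of that step-0 guarantee, assuming a pathlib-based layout (the helper name is mine; only the copy2 call and the starter_copy.riv filename come from the pipeline):

```python
import shutil
from pathlib import Path

def make_starter_copy(original_riv: Path, output_dir: Path) -> Path:
    """Preserve an untouched copy of the source .riv before any wiring edits."""
    output_dir.mkdir(parents=True, exist_ok=True)
    starter = output_dir / "starter_copy.riv"
    shutil.copy2(original_riv, starter)  # copy2 also preserves file metadata
    return starter
```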
A. CharacterScaler from manifest
arm_translation_scale = arm_length / human_arm_reach
arm_translation_cap = arm_length (hand translation can never exceed the arm's own reach)
lateral_sway_cap = body_width × 0.5
Chick: arm_length=156px → scale=0.277, cap=156px, sway=212px
Bigger animal auto-adjusts. No hardcoded values.
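A hedged sketch of how those three rules could compose (the manifest keys and the human_arm_reach constant are assumptions for illustration; the formulas are the ones above):

```python
from dataclasses import dataclass

HUMAN_ARM_REACH_PX = 563.0  # assumed reference reach: 156 / 0.277 ≈ 563

@dataclass
class CharacterScaler:
    arm_translation_scale: float
    arm_translation_cap: float
    lateral_sway_cap: float

    @classmethod
    def from_manifest(cls, manifest: dict) -> "CharacterScaler":
        arm_length = manifest["arm_length"]  # chick: 156px
        body_width = manifest["body_width"]  # chick: 424px, so sway cap 212px
        return cls(
            arm_translation_scale=arm_length / HUMAN_ARM_REACH_PX,  # ≈ 0.277
            arm_translation_cap=arm_length,     # hand stays within its reach
            lateral_sway_cap=body_width * 0.5,
        )
```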
B. Load configs (animation_defaults.json, gesture_disambiguation.json)
C. For each gesture:
Read Babel segment JSON (60 frames × 3 segments)
_derive_schema_entry() — percentile(95) of arm angle, root.y, body.rot, head.rot
_apply_video_priority() — Diane's video magnitudes are the FLOOR; Babel can ADD but not OVERRIDE (first sketch below)
_apply_gesture_disambiguation() — enforces look_at_letter≠ask_child even if MoCap labels overlap
_derive_curve_entry() — savgol smooth → cycle detect → RDP reduce (second sketch below)
→ motion_shape, peak, p95, cycle_period
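The FLOOR semantics of _apply_video_priority() reduce to a per-part max(); a sketch assuming scalar magnitudes (the signature is mine):

```python
def apply_video_priority(babel_mag: float, video_mag: float | None) -> float:
    """Verified-video magnitude is a floor: Babel can raise it, never lower it."""
    if video_mag is None:
        return babel_mag              # no verified video for this body part
    return max(video_mag, babel_mag)  # video sets the floor; Babel may only add
```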
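And a sketch of the _derive_curve_entry() chain, assuming scipy and the rdp package; the window length, epsilon, and zero-crossing cycle detector are illustrative choices, and motion_shape classification is omitted:

```python
import numpy as np
from scipy.signal import savgol_filter
from rdp import rdp  # pip install rdp

def derive_curve_entry(values: np.ndarray, fps: float = 20.0) -> dict:
    smoothed = savgol_filter(values, window_length=9, polyorder=3)
    # crude cycle detection: two zero-crossings of the centered signal ≈ one cycle
    crossings = np.where(np.diff(np.sign(smoothed - smoothed.mean())) != 0)[0]
    cycle_period = (
        2.0 * float(np.mean(np.diff(crossings))) / fps
        if len(crossings) > 1 else None
    )
    # RDP keeps only the shape-defining (frame, value) points
    points = np.column_stack([np.arange(len(smoothed)), smoothed])
    keyframes = rdp(points, epsilon=0.5)
    return {
        "peak": float(np.max(np.abs(smoothed))),
        "p95": float(np.percentile(np.abs(smoothed), 95)),
        "cycle_period_s": cycle_period,
        "keyframes": keyframes.tolist(),
    }
```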
D. auto_gesture_specs.generate_specs_for_gesture()
Converts curves + schema → engine-ready track specs
{ role: "arm_left", prop: "rotation", pattern, mag, dur }
FPS scaling: source 20fps (Babel) → playback 60fps → 3× longer cycles
Prevents: celebrating 0.4s jitter → 1.3s graceful cycle (sketch below)
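One reading of that FPS rule as code, assuming the cycle period is stretched by the playback/source ratio rather than resampled (constants and function name are mine):

```python
SOURCE_FPS = 20.0    # Babel capture rate
PLAYBACK_FPS = 60.0  # Rive playback rate

def scale_cycle_period(cycle_period_s: float) -> float:
    # 3x stretch: a ~0.43s celebrating cycle becomes a ~1.3s one
    return cycle_period_s * (PLAYBACK_FPS / SOURCE_FPS)
```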
gesture_schema_v3.json, motion_curves.json, auto_gesture_specs.json, bundle_metadata.json, motion_curves_dashboard.html (interactive frame browser)
prompt_1: ViewModel + State Machine + boolean inputs
prompt_2: Verify SVG groups match manifest group_map
prompt_3: Emit idle animation
prompt_4: Blink overlay (independent layer)
prompt_5: Mouth visemes (6 shapes, independent layer)
prompt_6: ALL gesture poses
prompt_7: Audit probe (list_objects verification)
prompt_8: State machine (schema-driven transitions, 3 layers)
prompt_9: Post-wire verification
prompt_10: Export .riv
gesture_engine.emit_gesture() — the keyframe emitter
For each track in auto_gesture_specs:
_mag() — magnitude from bundle schema
Pattern — (frame, value, easing) list
_clamp() — POSE_LIMITS safety envelope
Pivot resolution: parts_metadata.pivot → layers.pivot → artboard center
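A hedged sketch of the emitter loop; the track fields mirror the spec shape above, but the POSE_LIMITS values, the schema layout, and the emit_keyframe callback (standing in for the MCP call) are assumptions:

```python
POSE_LIMITS = {"rotation": (-45.0, 45.0)}  # illustrative safety envelope

def _clamp(prop: str, value: float) -> float:
    lo, hi = POSE_LIMITS.get(prop, (float("-inf"), float("inf")))
    return min(max(value, lo), hi)

def resolve_pivot(part_meta: dict, layer: dict, artboard_center: tuple) -> tuple:
    # fallback chain from above: parts_metadata.pivot → layers.pivot → center
    return part_meta.get("pivot") or layer.get("pivot") or artboard_center

def emit_gesture(specs: list[dict], schema: dict, emit_keyframe) -> None:
    for track in specs:                             # one track per role/property
        mag = schema[track["role"]][track["prop"]]  # _mag(): bundle lookup
        for frame, unit_value, easing in track["pattern"]:
            emit_keyframe(
                role=track["role"],
                prop=track["prop"],
                frame=frame,
                value=_clamp(track["prop"], unit_value * mag),
                easing=easing,
            )
```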
~1200 sequential MCP calls per animal (chick reference)
<animal>_babel_tuned_wired.riv

This is not a single-model story. It's a pipeline story — and pipelines are where most "ML in production" failures actually happen. Each step is small, deterministic, and replaceable. The character scaling step doesn't know about the curve smoothing step. The keyframe emitter doesn't know about the eval. Each stage can be swapped (e.g., RTMLib → MediaPipe; Babel MoCap → custom data) without touching the others.
Two design choices worth calling out:
python3 scripts/run_external_preview_batch.py rebuilds all animals.
projects/rive_animation/new_pipeline/runner/external_previewer.py
vault/2 dev/vp-ml-neural/rive_animation/diagrams/2026-04-18-external-previewer-pipeline-diagram-v2.md