PROBLEM: A multi-agent system produced hundreds of unstructured messages a day across 8 VP agents — review time became the bottleneck.
WHY IT MATTERS: Without a classifier + routing layer, every message is a context-switch tax. With it, only the right thread reaches the right person, and per-session token cost becomes attributable.
STACK: Python (9-class classifier), SQLite (forum.db), Flask (localhost:5556 API), HTML/CSS/JS (this dashboard)

Parking Lot v3 — Full Post-Classifier Flow Map

What happens AFTER Layer 3 (classification). Every category traced with color + preemption scoring.

Overview: All Paths
Path A: Bypass
Path B: 2-Stage Scope
Path C: Scored Delivery
Path D: Auto-Process
Preemption Scorer
Full Category Matrix

Layer 4+: What Happens After Classification

Layer 3 output: category + subcategory + confidence. Layer 4 routes to one of 4 paths.

From Layer 3: Classified Issue
category + subcategory + confidence + skip_intake flag
Layer 4: CATEGORY ROUTER
Routes to one of 4 paths based on category
PATH A: BYPASS
No VP. No buffer. Respond and close.
question:factual → grep
question:metric → read file
question:status → read exec_log
question:cost → invoke analyst
question:process → VP Systems inline
report:session → SKIP intake
report:shipped → ack + archive
correction:skill → direct edit
correction:naming → direct edit
✓ Auto-closed. Free. No interruption.
PATH B: 2-STAGE SCOPE
Scope first. Diane approves. THEN buffer.
feature:*
improvement:*
question:architecture
Stage 1: /improve MINIMAL
Sanity + domain + analyst brief
Headless. ~Sonnet. ~5 min.
Diane: "Worth exploring?"
[Yes --top X] or [No → close]
YES ▼
Stage 2: /improve FULL --top X
All 9 steps. Scout + Planner.
Headless. ~Opus. ~20 min.
Diane: "Approve for VP?"
[Approve] [Defer] [Reject]
APPROVE ▼
BUFFER (delivery gate)
Wait for VP capacity. Batch related.
VP receives pre-scoped work
PATH C: SCORED DELIVERY
Score decides: preempt, switch, or buffer.
bug:*
direction:task
proposal:*
correction:data/process
PREEMPTION SCORER
base + keywords + tone + age + task_penalty
≥70: PREEMPT (drop task)
40-69: SWITCH (at break)
<40: BUFFER (VP pulls)
VP acts
Proposals also need Diane approval
PATH D: AUTO-PROCESS
System handles. No VP. No Diane.
status update → Outstanding.md
direction:priority → P0 flag
process:* → VP Systems edit
✓ Auto-processed and closed.
Category legend: bug, feature, improvement, question, correction, direction, proposal, report, process
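The Layer 4 router above can be sketched as a lookup table. This is a minimal illustration built only from the four path boxes in the overview; the names `EXACT_ROUTES`, `WILDCARD_ROUTES`, and `route` are hypothetical, and raising an error for unknown categories is an assumption, since the overview does not say what happens to them.

```python
from typing import Optional

# Exact category:subcategory rules, copied from the path boxes in the overview.
EXACT_ROUTES = {
    ("question", "factual"): "A",
    ("question", "metric"): "A",
    ("question", "status"): "A",
    ("question", "cost"): "A",
    ("question", "process"): "A",
    ("question", "architecture"): "B",
    ("report", "session"): "SKIP",   # filtered out before intake
    ("report", "shipped"): "A",
    ("correction", "skill"): "A",
    ("correction", "naming"): "A",
    ("correction", "data"): "C",
    ("correction", "process"): "C",
    ("direction", "task"): "C",
    ("direction", "priority"): "D",
}

# Rules written as category:* in the overview.
WILDCARD_ROUTES = {
    "feature": "B",
    "improvement": "B",
    "bug": "C",
    "proposal": "C",
    "process": "D",
    "status update": "D",
}


def route(category: str, subcategory: Optional[str] = None) -> str:
    """Return the path (A, B, C, D, or SKIP) for a classified issue."""
    key = (category, subcategory)
    if key in EXACT_ROUTES:
        return EXACT_ROUTES[key]
    if category in WILDCARD_ROUTES:
        return WILDCARD_ROUTES[category]
    # Assumption: the source does not define a fallback path.
    raise ValueError(f"no route for {category}:{subcategory}")
```

Exact rules are checked before wildcards so that `question:architecture` can go to Path B while the other `question:*` subcategories stay on Path A.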

Path A: Bypass — No VP, No Buffer, Respond and Close

Cheapest path. System answers directly. Zero interruption to VPs.

question:factual
e.g., "Where is the config file for reporter?"
grep/glob codebase
respond inline with file path
close issue
Cost: free | Time: <30s
question:metric
e.g., "What's the JPY Sharpe ratio?"
read tracker.csv / Views.md
respond with value + source
close issue
Cost: free | Time: <30s
question:cost
e.g., "How much did today's sessions cost?"
invoke analyst (Sonnet)
respond with cost breakdown
close issue
Cost: ~Sonnet | Time: ~2 min
correction:skill / correction:naming
e.g., "Rename the postmortem to include time"
VP Systems direct edit
reply: "Updated. Commit abc1234."
close issue
Cost: free | Time: ~5 min
report:session (VP reports)
e.g., "VP ML session: DirectLGBM shipped"
SKIP intake entirely
These are VP→Diane FYI. Not actionable issues. Filtered by classifier.
status update (blocker cleared)
e.g., "Registered in Play Store (F.2 done)"
detect milestone/blocker reference
update Outstanding.md + tracker.csv
reply: "F.2 marked done."
close issue
Cost: free | Milestone-specific update.

Path B: Two-Stage Scoping — Feature & Improvement Flow

Diane controls both gates. VP never sees unscoped work.

Issue classified as: feature or improvement or question:architecture
STAGE 1: /improve MINIMAL (headless, no VP)
What runs:
✓ Sanity check (grep for duplicates)
✓ Domain research (WebSearch)
✓ Analyst brief (1-paragraph ROI)
What does NOT run:
✗ Scout (prior art deep dive)
✗ Planner (implementation plan)
✗ Full gap analysis (9 steps)
Cost: ~Sonnet | Time: ~5 min | Artifacts: minimal_brief
GATE 1: Diane — "Worth exploring?"
"Exogenous variables for ML model"
Not duplicate. Domain research: macro indicators improve FX accuracy 15-30%.
Analyst: High confidence. Standard in FX literature.

[Yes --top 5] [No]
Diane reviews async (phone/Forum Mobile). ~30 seconds.
NO ▼
CLOSED
No further cost.
Reason logged.
YES --top X ▼
STAGE 2: /improve FULL --top X
All 9 /improve steps run headless:
✓ Smoke test ✓ Domain deep dive ✓ KPI gap analysis
✓ Scout (prior art, build vs buy) ✓ Planner (milestones, tasks, estimates)
✓ Top X recommendations ranked by ROI
Cost: ~Opus | Time: ~20 min | Artifacts: improve_report + scout_report + implementation_plan
Diane controls X — how many results to produce (default 5, can set 3, 7, 10...)
GATE 2: Diane — "Approve for VP?"
Top 5 recommendations:
1. Add DXY index (est 2h, +12% MAPE)
2. Add VIX volatility (est 1h, +8% MAPE)
3. Add yield spreads (est 3h, +5% MAPE)
Plan: 3 milestones, 8 tasks, est 6h total
Scout: yfinance free tier, Alpha Vantage API
[Approve] [Defer] [Reject]
APPROVE ▼
NOW ENTERS BUFFER
state: approved
Linked to milestone + KPI in tracker
Waits for VP capacity via delivery gate
VP receives pre-scoped work
All artifacts attached. No in-session scoping.
Picked up via /next-task Level 2.5
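The two gates above behave like a small state machine. The sketch below is one way to encode it, not the dashboard's actual implementation. Only the `approved` state name appears in the flow; the other state names, the decision strings, and the `advance` helper are assumptions made for illustration. What happens to a deferred item later is not specified in the flow, so the sketch treats it as a resting state.

```python
from enum import Enum


class ScopeState(Enum):
    GATE1 = "awaiting_gate1"   # Stage 1 brief ready, Diane decides "Worth exploring?"
    GATE2 = "awaiting_gate2"   # Stage 2 report ready, Diane decides "Approve for VP?"
    APPROVED = "approved"      # enters the buffer (delivery gate)
    DEFERRED = "deferred"      # hypothetical name; later handling is unspecified
    CLOSED = "closed"          # "No" at Gate 1 or "Reject" at Gate 2


# (current state, Diane's decision) -> next state
TRANSITIONS = {
    (ScopeState.GATE1, "yes"): ScopeState.GATE2,      # triggers /improve FULL --top X
    (ScopeState.GATE1, "no"): ScopeState.CLOSED,
    (ScopeState.GATE2, "approve"): ScopeState.APPROVED,
    (ScopeState.GATE2, "defer"): ScopeState.DEFERRED,
    (ScopeState.GATE2, "reject"): ScopeState.CLOSED,
}


def advance(state: ScopeState, decision: str) -> ScopeState:
    """Apply one of Diane's decisions; reject anything the flow does not allow."""
    try:
        return TRANSITIONS[(state, decision)]
    except KeyError:
        raise ValueError(f"{decision!r} is not valid in state {state.name}") from None


def may_enter_buffer(state: ScopeState) -> bool:
    """Only approved items reach the buffer, so a VP never sees unscoped work."""
    return state is ScopeState.APPROVED
```

Because there is no transition from `GATE1` directly to `APPROVED`, an item cannot reach a VP without passing both gates.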

Path C: Scored Delivery — Bugs, Directions, Proposals, Corrections

Preemption scorer decides WHEN the VP sees it. Buffer batches low-urgency items.

Issues routed to Path C:
bug:* direction:task proposal:* correction:data/process
PREEMPTION SCORER (5 components → 0-100)
base (category): up to 40
keywords (urgency): up to 30
tone (escalation): up to 20
age (staleness): up to 10
penalty (task progress): down to -20
≥70 PREEMPT
VP drops current task. Direct push.

When this fires:

Diane confirmed bug by testing (base=40)
Data loss/corruption detected (base=40)
"Stop!" + role violation (kw=30)
3+ rapid messages same issue (tone=20)

Real examples (from data):

• "Stop! Why are you updating skills??" → 80
• Email scanner 2nd bug + escalation → 95
• "very very important to me" (Discord) → 80
40-69 SWITCH
VP finishes current sub-task, then switches.

When this fires:

Stale state / redundancy catch (base=20+kw=20)
Domain knowledge injection (base=20+bonus)
"did you check..." (kw=15)
Prior decision override (base=30)

Real examples:

• "You only worked for 2 hours" → 65
• "M9-M12 are already done" → 40
• "did you check forum?" → 45
<40 BUFFER
Parked. VP pulls via /next-task when ready.

When this fires:

New bug report (not confirmed) (base=15)
Direction without urgency (base=15)
Proposal for later (base=15)

Buffer behavior:

• Batched by VP × priority × category
• 3 UI bugs → delivered as 1 batch
• Age escalation: +10 if open 2+ sessions
• VP pulls at /next-task Level 2.5

Real examples:

• Q.58b "Play Store next steps" → 25
• "I'd like diversity of datasets" → -2
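The buffer's batching rule (VP × priority × category) amounts to grouping by a three-part key. A minimal sketch follows, assuming a hypothetical `BufferedIssue` record; the field names and the example priority label are illustrative and not taken from forum.db.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class BufferedIssue:
    issue_id: str
    vp: str          # e.g. "vp-mobile"
    priority: str    # hypothetical label such as "P2"
    category: str    # e.g. "bug:ui"


BatchKey = Tuple[str, str, str]


def batch_issues(issues: Iterable[BufferedIssue]) -> Dict[BatchKey, List[BufferedIssue]]:
    """Group buffered issues so each (vp, priority, category) is delivered once."""
    batches: Dict[BatchKey, List[BufferedIssue]] = defaultdict(list)
    for issue in issues:
        batches[(issue.vp, issue.priority, issue.category)].append(issue)
    return dict(batches)
```

With this grouping, three `bug:ui` issues for the same VP at the same priority land in a single batch, which matches the "3 UI bugs → delivered as 1 batch" behavior described above.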

Scorer Retraining

After each preemption decision, log: {issue_id, score, band, diane_agreed: yes/no}
Weekly: retrain component weights from Diane's labels. Same approach as classifier retraining.
Target: <2 misclassified preemptions per week.
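The logging half of this loop could look like the sketch below, which stores the four logged fields in SQLite and counts the decisions Diane disagreed with. The table name `preemption_log`, the `logged_at` column, and the helper names are assumptions; the actual forum.db schema is not shown here. Retraining the weights themselves is out of scope for this sketch.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS preemption_log (
    issue_id     TEXT    NOT NULL,
    score        INTEGER NOT NULL,
    band         TEXT    NOT NULL,
    diane_agreed INTEGER NOT NULL,  -- 1 = yes, 0 = no
    logged_at    TEXT    NOT NULL   -- ISO 8601 timestamp in UTC
)
"""


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the log; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def log_decision(conn: sqlite3.Connection, issue_id: str, score: int, band: str,
                 diane_agreed: bool, logged_at: Optional[str] = None) -> None:
    """Record one preemption decision together with Diane's label."""
    if logged_at is None:
        logged_at = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO preemption_log VALUES (?, ?, ?, ?, ?)",
        (issue_id, score, band, int(diane_agreed), logged_at),
    )
    conn.commit()


def disagreements_since(conn: sqlite3.Connection, since_iso: str) -> int:
    """Count decisions Diane disagreed with at or after the given timestamp."""
    row = conn.execute(
        "SELECT COUNT(*) FROM preemption_log "
        "WHERE diane_agreed = 0 AND logged_at >= ?",
        (since_iso,),
    ).fetchone()
    return row[0]
```

Calling `disagreements_since` with a timestamp one week back gives the number to compare against the weekly target of fewer than 2 misclassified preemptions.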

Path D: Auto-Process — System Handles, No Human Needed

Status Update (blocker cleared)
Diane: "Registered in Play Store (F.2)"
Detect milestone reference (F.2)
Update Outstanding.md: F.2 → [x]
Update tracker.csv: milestone status
Reply: "F.2 marked done. Milestone updated."
Close issue
direction:priority (P0 flag)
Diane: "Stop work on X, focus on Y"
Always preempts (score auto ≥70)
Reprioritize VP tracker
Reply: "Priority changed. VP notified."
process:* (rule changes)
Diane: "All VPs must check Forum before ending"
VP Systems direct edit to SKILL.md
Reply: "Rule encoded. Commit abc1234."
Close issue
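The status-update flow at the top of this section (detect the milestone reference, then tick it in Outstanding.md) can be sketched with two small helpers. Both the ID pattern (a capital letter, a dot, digits, and an optional lowercase suffix, as in "F.2") and the checklist layout with `[ ]` / `[x]` boxes are assumptions; the real Outstanding.md format is not shown in this document.

```python
import re
from typing import List

# Assumed milestone ID shape: "F.2", "Q.58b", and similar.
MILESTONE_RE = re.compile(r"\b([A-Z]\.\d+[a-z]?)\b")


def find_milestones(message: str) -> List[str]:
    """Return every milestone-like ID mentioned in a message."""
    return MILESTONE_RE.findall(message)


def mark_done(outstanding_text: str, milestone_id: str) -> str:
    """Tick the unchecked checklist line(s) that mention the milestone.

    Lines are matched on the whole ID, so "F.2" does not tick "F.20".
    A trailing newline in the input is not preserved.
    """
    id_pattern = re.compile(rf"(?<!\w){re.escape(milestone_id)}(?!\w)")
    updated = []
    for line in outstanding_text.splitlines():
        if "[ ]" in line and id_pattern.search(line):
            line = line.replace("[ ]", "[x]", 1)
        updated.append(line)
    return "\n".join(updated)
```

A caller would run `find_milestones` on the incoming message, apply `mark_done` for each ID found, and then write the result back before replying and closing the issue.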

Preemption Scorer — Full Component Breakdown

| Component | Signal | Score |
|---|---|---|
| Base Category (0-40) | QA failure confirmed by Diane using product | 40 |
| | Data loss / corruption detected | 40 |
| | Infrastructure failure blocking comms | 35 |
| | Prior decision being violated | 30 |
| | Role/process boundary violation | 30 |
| | Cost waste at calculable rate | 25 |
| | Domain knowledge injection | 20 |
| | Stale state / redundancy catch | 20 |
| | Feature priority pivot | 15 |
| | New unconfirmed bug report | 15 |
| | Design iteration feedback | 5 |
| Urgency Keywords (0-30) | "Stop!" / "very very" / "this is critical" | 30 |
| | "I found a bug" / "X is broken" / two bugs same session | 25 |
| | "based on earlier conversations we decided..." | 20 |
| | "why is X..." (active use, something missing) | 20 |
| | "did you check..." | 15 |
| | "how about..." (domain question) | 10 |
| | "I was thinking..." / "I'd like..." | 3 |
| Tone (0-20) | 3+ messages rapid succession about same issue | 20 |
| | Past-tense confirmation ("I tested it, it...") | 15 |
| | Explicit cost calculation presented | 15 |
| | Escalation from earlier in same session | 10 |
| | Single informational message | 0 |
| Age (0-10) | Issue raised 2+ sessions ago, still open | 10 |
| | Issue raised earlier same session | 5 |
| | New issue this session | 0 |
| Current Task Penalty (-20 to +5) | Task >80% complete (about to commit) | -20 |
| | Task 50-79% complete | -10 |
| | Task <50% / debugging / stuck | 0 |
| | Task is meta/process (not user-facing) | +5 |
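The task-penalty rows above map naturally onto a small function. This is a sketch under two stated assumptions: a task at exactly 80% is treated like ">80%" (the breakdown leaves that value undefined), and the meta/process bonus overrides the progress-based penalty. The name `task_penalty` and its parameters are hypothetical.

```python
def task_penalty(progress_pct: float, is_meta: bool = False) -> int:
    """Return the current-task adjustment for an interrupting issue.

    progress_pct: completion of the VP's current task, 0 to 100.
    is_meta: True when the current task is meta/process work (not user-facing).
    """
    if is_meta:
        # Assumption: the +5 for meta/process work replaces any progress penalty.
        return 5
    if progress_pct >= 80:
        # Assumption: exactly 80% counts as "about to commit".
        return -20
    if progress_pct >= 50:
        return -10
    return 0
```

The negative values make it harder to interrupt a VP who is close to committing, while the small bonus makes meta work slightly easier to interrupt.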

Calibration: 9 Real Events

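| Event | Base | Kw | Tone | Age | Pen | Total | Band | Match? |
|---|---|---|---|---|---|---|---|---|
| "Stop! Why are you updating skills??" | 30 | 30 | 15 | 5 | 0 | 80 | PREEMPT | ✓ |
| Email scanner 2nd bug + escalation | 40 | 25 | 20 | 5 | +5 | 95 | PREEMPT | ✓ |
| "very very important to me" (Discord) | 35 | 30 | 15 | 0 | 0 | 80 | PREEMPT | ✓ |
| "You only worked for 2 hours" | 30 | 20 | 15 | 0 | 0 | 65 | SWITCH | ✓ |
| "M9-M12 are already done" | 20 | 20 | 0 | 0 | 0 | 40 | SWITCH | ✓ |
| "did you check forum?" | 20 | 15 | 10 | 0 | 0 | 45 | SWITCH | ✓ |
| "how about exogenous variables?" | 20 | 10 | 0 | 0 | -10 | 20 | BUFFER | ~ |
| Q.58 "registered + need breakdown" | 15 | 10 | 0 | 0 | 0 | 25 | BUFFER | ✓ |
| "I'd like diversity of datasets" | 5 | 3 | 0 | 0 | -10 | -2 | BUFFER | ✓ |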
8/9 correct. 1 edge case (~): Diane's domain questions during ML work occasionally have paradigm-level ROI. Needs VP-ML context bonus.
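The calibration rows above can be replayed with a few lines of code, which is a quick way to confirm that each total is the plain sum of its five components and that the band follows from the Path C thresholds (≥70, 40-69, <40). The names `Components`, `total_score`, `band_for`, and `CALIBRATION` are hypothetical. The sum is deliberately not clamped to 0-100, because the table itself contains a total of -2.

```python
from typing import List, NamedTuple, Tuple


class Components(NamedTuple):
    base: int
    keywords: int
    tone: int
    age: int
    penalty: int


def total_score(c: Components) -> int:
    """Sum the five components: base + keywords + tone + age + task penalty."""
    return c.base + c.keywords + c.tone + c.age + c.penalty


def band_for(score: int) -> str:
    """Map a score to its delivery band using the Path C thresholds."""
    if score >= 70:
        return "PREEMPT"
    if score >= 40:
        return "SWITCH"
    return "BUFFER"


# (event, components, total listed in the calibration table)
CALIBRATION: List[Tuple[str, Components, int]] = [
    ("Stop! Why are you updating skills??", Components(30, 30, 15, 5, 0), 80),
    ("Email scanner 2nd bug + escalation", Components(40, 25, 20, 5, 5), 95),
    ("very very important to me (Discord)", Components(35, 30, 15, 0, 0), 80),
    ("You only worked for 2 hours", Components(30, 20, 15, 0, 0), 65),
    ("M9-M12 are already done", Components(20, 20, 0, 0, 0), 40),
    ("did you check forum?", Components(20, 15, 10, 0, 0), 45),
    ("how about exogenous variables?", Components(20, 10, 0, 0, -10), 20),
    ("Q.58 registered + need breakdown", Components(15, 10, 0, 0, 0), 25),
    ("I'd like diversity of datasets", Components(5, 3, 0, 0, -10), -2),
]

for event, parts, listed_total in CALIBRATION:
    score = total_score(parts)
    assert score == listed_total, event
    print(f"{score:>4}  {band_for(score):<8} {event}")
```

The edge case marked `~` scores 20 and lands in the lowest band, which is why the note above proposes a VP-ML context bonus for that kind of question.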

Full Category × Path × Action Matrix

| Category | Subcategory | Path | Scorer? | Buffer? | VP? | Diane? | Action |
|---|---|---|---|---|---|---|---|
| bug | ui | C | Yes | Score | vp-mobile | No | Fix + verify build |
| bug | infra | C | Yes | Score | vp-systems | No | Fix daemon/service |
| bug | data | C | Yes | Score | owner | No | Fix data pipeline |
| bug | crash | C | Yes (≥70) | No | owner | No | P0: preempt always |
| bug | regression | C | Yes (≥70) | No | owner | No | P0: preempt always |
| feature | new | B | No | After approve | After approve | 2x | /improve min → full → VP |
| feature | enhancement | B | No | After approve | After approve | 2x | /improve min → full → VP |
| feature | infra | B | No | After approve | vp-systems | 2x | /improve min → full → VP |
| feature | ux | B | No | After approve | vp-mobile | 2x | /improve min → full → VP |
| improvement | * | B | No | After approve | After approve | 2x | /improve min → full → VP |
| question | factual | A | No | No | No | No | grep → respond → close |
| question | metric | A | No | No | No | No | read file → respond → close |
| question | status | A | No | No | No | No | read exec_log → respond → close |
| question | cost | A | No | No | No | No | invoke analyst → respond → close |
| question | process | A | No | No | No | No | VP Systems inline → close |
| question | architecture | B | No | After approve | VP | 1x | /improve min → Diane → VP |
| correction | skill | A | No | No | No | No | VP Systems direct edit → close |
| correction | naming | A | No | No | No | No | VP Systems direct edit → close |
| correction | data | C | Yes | Score | owner | No | Fix data |
| correction | process | C | Yes | Score | vp-systems | No | Fix process/rule |
| direction | task | C | Yes | Score | owner | No | Create task → VP |
| direction | priority | D | Auto ≥70 | No | Yes | No | Reprioritize immediately |
| proposal | architecture | C | Yes | Score | VP | Yes | Score → VP + Diane approval |
| proposal | org | C | Yes | Score | | Yes | Diane only |
| proposal | process | C | Yes | Score | vp-systems | Yes | Score → VP + Diane approval |
| report | session | SKIP | No | No | No | No | Filtered out of intake |
| report | shipped | A | No | No | No | No | Ack + archive → close |
| process | * | D | No | No | vp-systems | No | Direct edit → close |
| status update | * | D | No | No | No | No | Outstanding + tracker → close |