v6 Stages × Extractors Matrix
- Status: DRAFT (Phase 0 companion artifact)
- Parent spec: conversation-v6-feature.md §2.5, §3.4, §3.1
- Sibling: v6-stage-resolver-truth-table.md (Phase 0 companion)
- Blocks: v6 Phase 2b (config/prompts/stages.yaml + extractors_active wiring)
- Owner: Backend / Agents
- Last updated: 2026-05-12
This document maps every v6 conversation stage to the extractor pipeline that runs during it. It is wiring, not content — prompt authoring lives in Phase 2b. The matrix is the source of truth for the extractors_active: [list] field that each stages.yaml entry carries (per v6 spec §3.4 mitigation).
1. Stage list
The 12 stages below match stage_id keys in the sibling truth-table doc. support is the fallback bucket — every malformed-state or no-rule-matches case lands here.
| stage_id | Goal |
|---|---|
| `discovery` | Find out what procedure / why-now / where they're starting from |
| `procedure_identification` | Confirm specific procedure + laterality / anatomy / mechanism |
| `records_collection` | Get medical records uploaded for clinical context |
| `match_review` | Patient reviews matched providers, asks questions, picks subset |
| `consent_capture` | HIPAA / GDPR / DPDP consent for record forwarding |
| `mso_offer` | Offer Medical Second Opinion video consult before booking |
| `scheduling` | Pick MSO video slot / book consultation |
| `pre_travel` | Logistics, visas, passport, travel readiness |
| `in_treatment` | During-stay support (mostly coordinator-handled; agent is light-touch) |
| `recovery_offer` | Post-op recovery facility offer (ADR-0018 §K) |
| `recovery_followup` | Post-op milestone check-ins (ADR-0018 §K) |
| `support` | Fallback: malformed state or no rule matches; base + patient_context only |
Total: 12 stages (11 forward-flow + 1 fallback).
2. Extractor catalog
Sourced from app/services/extractors/*.py (verified on main 2026-05-12) plus the v6-introduced gate. The existing 5 layer extractors plus recovery_checkin_extractor (landed via PR #832) cover every documented v6 stage. No new extractors are added by v6 — the v6 work is wiring + system-prompt language sweep per spec §3.4.
| extractor_id | Source file | Tier | Caching | Failure mode |
|---|---|---|---|---|
| `intent_extractor` | `app/services/extractors/intent_extractor.py` | Haiku 4.5 | None (per-turn LLM call; output merged into `layer_state.intent_capture`) | Returns empty delta on `LLMGatewayError`; layer completion stays at previous value (`_base.py:97-99`) |
| `medical_extractor` | `app/services/extractors/medical_extractor.py` | Haiku 4.5 | ICD-10/SNOMED code map cached 30 days in Redis under `icd_cache:v2:dx=...\|proc=...` (line 27) | Same non-blocking pattern; ICD map degrades to empty list if cache + LLM both fail |
| `travel_extractor` | `app/services/extractors/travel_extractor.py` | Haiku 4.5 | None at extractor level; transport-tier table loaded once at import via `scoring_config.get_config` | Returns empty delta; `compute_transport_tier()` falls back to T1 if scores all zero |
| `logistics_extractor` | `app/services/extractors/logistics_extractor.py` | Haiku 4.5 | None; `VISA_TABLE` is static module-level dict (line 48) | Returns empty delta; visa derivation returns `None` for unknown country pairs |
| `financial_extractor` | `app/services/extractors/financial_extractor.py` | Haiku 4.5 | None at extractor level | Returns empty delta; `qualify_budget()` returns `incomplete` when budget falsy |
| `recovery_checkin` | `app/services/extractors/recovery_checkin_extractor.py` | Haiku 4.5 | Escalation keyword YAML cached via `@lru_cache(maxsize=1)` (line 69) | Deterministic regex keyword pass still runs on LLM failure (line 27); only LLM-derived fields go empty |
Input shape (uniform across all six — see _base.run_extraction, lines 49-122):
- `message_text: str` is the latest user turn's content
- `existing_data: dict[str, Any]` is the current per-layer slice from `PatientLayerState`
- `tenant_id: str | None` and `case_id: str | None` are propagated for Langfuse session/user tagging
System prompt is static per extractor (hardcoded module-level constant); the v6 §3.4 sweep replaces "layer N" wording with stage-equivalent language but preserves schemas. No patient content embedded in system prompts (§3.4 CI gate tests/test_extractor_prompts_pii_safe.py).
Output schema (uniform envelope; extractor-specific delta inside):
{
"delta": { /* extractor-specific JSON keys, matches schema in the extractor's SYSTEM_PROMPT */ },
"completion_estimate": 0.0 // float in [0, 1]
}
The reducer (_merge_layer_deltas, triage_agent.py:323-) merges deltas into layer_state using high-water-mark on completion + deep merge on data.
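The merge rule described above (high-water-mark on completion, deep merge on data) can be sketched as follows. This is not the body of `_merge_layer_deltas`; it assumes each layer slice is a dict with `data` and `completion` keys, and `deep_merge` / `merge_layer_delta` are illustrative names.

```python
from typing import Any


def deep_merge(base: dict[str, Any], delta: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge delta into a copy of base; delta wins on leaf conflicts."""
    merged = dict(base)
    for key, value in delta.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_layer_delta(layer: dict[str, Any], envelope: dict[str, Any]) -> dict[str, Any]:
    """Apply one extractor envelope to one layer slice (assumed shape: data + completion)."""
    return {
        "data": deep_merge(layer.get("data", {}), envelope.get("delta", {})),
        # High-water mark: completion never moves backwards.
        "completion": max(
            layer.get("completion", 0.0),
            envelope.get("completion_estimate", 0.0),
        ),
    }
```

Under this rule an empty delta with a low `completion_estimate` leaves the slice unchanged, which is what makes the empty-delta failure mode safe.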
3. Stage × Extractor matrix
Legend:
- `run`: extractor is in this stage's `extractors_active` list and is always invoked when its signal gate fires
- `skip`: extractor not wired; orchestrator does not spawn it for this stage
- `cond(<predicate>)`: extractor is wired but only invoked when the predicate is true
Column abbreviations: INT=intent, MED=medical, TRV=travel, LOG=logistics, FIN=financial, REC=recovery_checkin.
| Stage | INT | MED | TRV | LOG | FIN | REC |
|---|---|---|---|---|---|---|
| `discovery` | run | run | skip | skip | skip | skip |
| `procedure_identification` | run | run | skip | skip | skip | skip |
| `records_collection` | run | run | skip | skip | skip | skip |
| `match_review` | `cond(decision_stage_volatile)` | run | run | run | run | skip |
| `consent_capture` | `cond(decision_stage_volatile)` | run | skip | skip | skip | skip |
| `mso_offer` | run | run | skip | skip | run | skip |
| `scheduling` | skip | run | run | run | run | skip |
| `pre_travel` | skip | run | run | run | run | skip |
| `in_treatment` | skip | run | skip | skip | skip | skip |
| `recovery_offer` | skip | run | run | skip | run | skip |
| `recovery_followup` | skip | run | skip | skip | skip | run |
| `support` | run | run | skip | skip | skip | skip |
run cell count: 30 across 12 stages × 6 extractors (= 72 cells total; 30 run, 2 cond, 40 skip).
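The cell counts above can be re-derived mechanically. The sketch below transcribes the matrix into a Python dict (the `MATRIX` and `extractors_active` names are illustrative, not existing code) and derives a per-stage list that treats `cond` cells as wired.

```python
from collections import Counter

EXTRACTORS = ("INT", "MED", "TRV", "LOG", "FIN", "REC")

# One row per stage, in column order INT, MED, TRV, LOG, FIN, REC.
MATRIX = {
    "discovery":                ("run",  "run", "skip", "skip", "skip", "skip"),
    "procedure_identification": ("run",  "run", "skip", "skip", "skip", "skip"),
    "records_collection":       ("run",  "run", "skip", "skip", "skip", "skip"),
    "match_review":             ("cond", "run", "run",  "run",  "run",  "skip"),
    "consent_capture":          ("cond", "run", "skip", "skip", "skip", "skip"),
    "mso_offer":                ("run",  "run", "skip", "skip", "run",  "skip"),
    "scheduling":               ("skip", "run", "run",  "run",  "run",  "skip"),
    "pre_travel":               ("skip", "run", "run",  "run",  "run",  "skip"),
    "in_treatment":             ("skip", "run", "skip", "skip", "skip", "skip"),
    "recovery_offer":           ("skip", "run", "run",  "skip", "run",  "skip"),
    "recovery_followup":        ("skip", "run", "skip", "skip", "skip", "run"),
    "support":                  ("run",  "run", "skip", "skip", "skip", "skip"),
}

# Tally cell kinds across all 12 x 6 = 72 cells.
counts = Counter(cell for row in MATRIX.values() for cell in row)


def extractors_active(stage_id: str) -> list[str]:
    """Wired extractors for a stage: every column that is not 'skip'."""
    return [name for name, cell in zip(EXTRACTORS, MATRIX[stage_id]) if cell != "skip"]
```

A check like this could run in CI so the prose count and the matrix cannot drift apart silently.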
Conditional predicate — decision_stage_volatile:
intent.decision_stage in {comparing_options, just_exploring}
OR
patient_state.layer_state.intent_capture.completion < 0.8
Used in match_review + consent_capture to re-run intent_extractor when the patient is still pivoting (per v6 spec §2.6 "backward stage moves" — match_review → procedure_identification is intentional when the patient pivots procedures). When intent is locked, skip the extractor to save a Haiku call.
3.1 Per-run cell: input slice + output consumer
For each run cell, the input is always the latest user message + the existing layer slice (per _base.run_extraction). The differentiator is which triage decision in the stage consumes the merged output. Mapping below uses stage_id rows; entries in italic are stages where the extractor runs but is not load-bearing for the stage's primary advance predicate (signal-gated catch).
| Cell | Output consumer in this stage |
|---|---|
| `discovery` × INT | procedure_identified flag (workflow_state); urgency / emotional readiness for guidance tone |
| `discovery` × MED | First-pass procedure name → flips procedure_identified → advances to procedure_identification |
| `procedure_identification` × INT | Refines decision_stage (just_exploring → comparing_options); feeds match_review gate |
| `procedure_identification` × MED | Confirms procedure + body_site/laterality + comorbidities → unblocks records_collection |
| `records_collection` × INT | *Catch*: signal-gated; runs if patient mentions urgency or new fear while uploading |
| `records_collection` × MED | Symptoms, medications, allergies, age; populates FHIR observations + advances medical_status.completion ≥ 0.7 |
| `match_review` × INT (cond) | Detects pivot (decision_stage → comparing_options) → triggers backward stage move per §2.6 |
| `match_review` × MED | New comorbidity / medication captured mid-review → re-scores match weights |
| `match_review` × TRV | First time mobility / oxygen surface; transport tier feeds match relevance |
| `match_review` × LOG | Country preferences + timeline → narrows shortlist |
| `match_review` × FIN | Budget band + insurance preauth → match qualification (qualified / stretch / mismatched) |
| `consent_capture` × INT (cond) | Detects last-minute hesitation → guidance softens, never re-asks consent until intent stabilises |
| `consent_capture` × MED | *Catch*: captures any new comorbidity revealed during consent reading; FHIR addendum |
| `mso_offer` × INT | Reads primary_fear → MSO offer framing (fear of complications → emphasise second opinion) |
| `mso_offer` × MED | *Catch*: symptoms surfaced during offer can re-rank consult urgency |
| `mso_offer` × FIN | Insurance coverage of MSO consult → affects offer wording (covered vs out-of-pocket) |
| `scheduling` × MED | Time-zone / fitness affecting consult timing |
| `scheduling` × TRV | Mobility → in-person vs video consult preference |
| `scheduling` × LOG | Timezone (country_of_residence) → consult slot proposal |
| `scheduling` × FIN | *Catch*: payment method surfaced during scheduling triggers payment-link card |
| `pre_travel` × MED | Recent medication changes affecting travel fitness |
| `pre_travel` × TRV | Transport tier (T1-T4) → travel-readiness checklist |
| `pre_travel` × LOG | Visa derivation + companion + timeline → pre-travel readiness card |
| `pre_travel` × FIN | *Catch*: final cost confirmation before flights |
| `in_treatment` × MED | Light-touch: medication / symptom updates flowing to coordinator |
| `recovery_offer` × MED | Procedure category determines recovery-facility eligibility |
| `recovery_offer` × TRV | Current mobility → recovery facility tier match |
| `recovery_offer` × FIN | Recovery facility cost vs budget → offer framing |
| `recovery_followup` × MED | Medication adherence, symptom check via free text |
| `recovery_followup` × REC | Primary: pain_level + escalation keywords → Telegram alert + coordinator escalation (ADR-0018 §K) |
| `support` × INT | Captures any signal the patient drops while resolver is stuck; kept warm so recovery from support is fast |
| `support` × MED | Same: opportunistic capture; never advances stage from support alone |
The cells marked catch run via the existing signal-gate mechanism (detect_layer_signals, triage_agent.py:297-) — already production-shipped behind flag extractor_signal_gating_enabled. v6 re-uses this gate verbatim. Phase 2b only declares extractors_active in stages.yaml; the gate handles the conditional skip.
4. Knowledge addendum gating
Per v6 spec §2.3 + §2.5 hard constraint: at most 1 knowledge addendum attaches per turn (Anthropic 4-cache-breakpoint limit). Addendums are triggered by patient context, not by stage — but stage interacts with the trigger predicate. Cross-reference with the sibling truth-table doc:
| Addendum | Trigger predicate (from §2.3) | Stages where this addendum is most likely to attach |
|---|---|---|
| `procedure_clinical_facts/{procedure_slug}.yaml` | patient.procedure identified | procedure_identification, records_collection, match_review, mso_offer, consent_capture |
| `financial_options.yaml` | Patient asked about cost OR budget tier undetermined | match_review, mso_offer, scheduling, pre_travel, recovery_offer |
| `post_travel_logistics.yaml` | stage_active in {pre_travel, in_treatment} AND patient.passport_status != confirmed | pre_travel, in_treatment (predicate is stage-scoped) |
| `insurance_handling.yaml` | Patient mentioned insurance OR funding_source == insurance | match_review, consent_capture, mso_offer, scheduling, recovery_offer |
| `mso_second_opinion.yaml` (migrated from v4 line 204) | mso_patient_offer_enabled flag true AND stage in {mso_offer, scheduling} | mso_offer, scheduling |
Priority ordering (per v6 spec §2.3 — clinical-safety addendums above commercial):
1. `procedure_clinical_facts` (clinical), highest priority
2. `post_travel_logistics` (clinical-adjacent: fitness-to-travel)
3. `insurance_handling` (commercial)
4. `financial_options` (commercial)
5. `mso_second_opinion` (commercial), lowest priority
When multiple match, the highest-priority addendum wins; the rest are dropped that turn (§2.3 cap of 1).
Stage-by-stage default addendum (the one that fires most often, given typical patient state at that stage):
| Stage | Default addendum | Notes |
|---|---|---|
| `discovery` | none | Procedure not yet identified; no triggers met |
| `procedure_identification` | `procedure_clinical_facts/{slug}.yaml` (once procedure surfaces) | Attaches the moment medical_extractor pins procedure.name |
| `records_collection` | `procedure_clinical_facts/{slug}.yaml` | Stable |
| `match_review` | `financial_options.yaml` OR `procedure_clinical_facts/{slug}` | Tie broken by priority; clinical wins |
| `consent_capture` | `insurance_handling.yaml` (if insurance) else `procedure_clinical_facts/{slug}` | Stable |
| `mso_offer` | `mso_second_opinion.yaml` (gated by mso_patient_offer_enabled) | Falls back to procedure_clinical_facts if MSO offer off |
| `scheduling` | `financial_options.yaml` | |
| `pre_travel` | `post_travel_logistics.yaml` | Predicate already stage-scoped |
| `in_treatment` | `post_travel_logistics.yaml` (residual logistics issues) | Light-touch overall |
| `recovery_offer` | `financial_options.yaml` | Cost is the chief recovery-offer concern |
| `recovery_followup` | none | Recovery turn is intentionally narrow: extractor + base only |
| `support` | none | Per §2.5, support collapses to base + patient_context only |
The full predicate gating belongs in app/agents/knowledge_addendum_resolver.py (new in Phase 2b); this table is the cheat-sheet for the resolver.
5. Parallel execution graph
Per triage_agent.py:953-957 (asyncio.gather(...) on the extractor task list), today's pipeline runs all signaled extractors in parallel within a single map-reduce node. v6 preserves this; the only delta is which extractors enter the gather group.
Hard data dependency (one rule today, one rule v6 may add):
- `medical_extractor` reads the `intent` classification implicitly (procedure framing depends on whether decision_stage indicates the patient is asserting vs exploring). Today this is resolved by the LLM reading `existing_data` (the prior layer_state), so the order is "previous turn's intent informs this turn's medical." There is no in-turn dependency that would prevent parallel execution; the LLM reads the `existing_data` snapshot.
- `recovery_checkin` is independent of all others: its escalation logic (regex keyword scan) is deterministic and does not consume other extractors' deltas. It can run in parallel with `medical_extractor` in `recovery_followup`.
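Because no extractor consumes another's in-turn delta, a single `asyncio.gather` over the group is enough. The sketch below uses stub extractors with fixed sleep times (all names and latencies are illustrative) to show that wall-clock time tracks the slowest call, not the sum.

```python
import asyncio
import time


async def fake_extractor(name: str, latency_s: float) -> dict:
    """Stub extractor: sleeps to simulate an LLM call, then returns an empty envelope."""
    await asyncio.sleep(latency_s)
    return {"extractor": name, "delta": {}, "completion_estimate": 0.0}


async def run_parallel_group(latencies: dict[str, float]) -> tuple[list[dict], float]:
    """Run every extractor in the group concurrently and time the whole gather."""
    start = time.perf_counter()
    results = await asyncio.gather(
        *(fake_extractor(name, secs) for name, secs in latencies.items())
    )
    return list(results), time.perf_counter() - start


# A 4-wide group shaped like the scheduling / pre_travel rows.
results, elapsed = asyncio.run(
    run_parallel_group({"MED": 0.05, "TRV": 0.02, "LOG": 0.03, "FIN": 0.04})
)
```

`asyncio.gather` returns results in argument order, so the reducer can rely on a stable extractor ordering even though completion order varies.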
Per-stage parallel groups (everything within a row runs concurrently via asyncio.gather):
| Stage | Parallel group | Critical path |
|---|---|---|
| `discovery` | {INT, MED} | max(haiku) over 2 parallel calls |
| `procedure_identification` | {INT, MED} | max(haiku); both Haiku |
| `records_collection` | {INT, MED} | max(haiku) + ICD cache lookup |
| `match_review` | {INT(cond), MED, TRV, LOG, FIN} | max(haiku); 5-wide gather |
| `consent_capture` | {INT(cond), MED} | max(haiku) |
| `mso_offer` | {INT, MED, FIN} | max(haiku) |
| `scheduling` | {MED, TRV, LOG, FIN} | max(haiku); 4-wide |
| `pre_travel` | {MED, TRV, LOG, FIN} | max(haiku); 4-wide |
| `in_treatment` | {MED} | haiku |
| `recovery_offer` | {MED, TRV, FIN} | max(haiku) |
| `recovery_followup` | {MED, REC} | max(haiku, regex_keyword_scan) |
| `support` | {INT, MED} | max(haiku) |
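Per-Haiku-call latency budget: ~600-900 ms p50, ~1500 ms p95 (per current Langfuse data, model_registry routes all to claude-haiku-4.5). The 5-wide match_review gather is the worst case: because all five Haiku calls run concurrently, its latency is that of the slowest call rather than the sum. Treat ~1500 ms as a lower bound on the gather's p95, since the p95 of a max over five calls sits above the single-call p95.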
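One caveat on the gather's tail latency is worth quantifying. Under the simplifying assumption that the concurrent calls have independent latencies (real calls to one provider are likely correlated, which shrinks the effect), the chance that all of them finish within the single-call p95 is 0.95 raised to the gather width. The helper names below are illustrative.

```python
def gather_quantile(single_call_quantile: float, width: int) -> float:
    """Probability that all `width` independent calls finish within the
    latency that a single call meets with probability `single_call_quantile`."""
    return single_call_quantile ** width


def required_single_call_quantile(target: float, width: int) -> float:
    """Single-call quantile whose latency bounds the gather at `target` probability."""
    return target ** (1.0 / width)


five_wide_at_p95 = gather_quantile(0.95, 5)              # about 0.774
needed_for_p95 = required_single_call_quantile(0.95, 5)  # about 0.990
```

In other words, under independence the single-call p95 latency is only about the p77 of a 5-wide gather, and the gather's own p95 sits near the single-call p99. This is one reason to measure gather duration directly rather than infer it from per-call figures.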
Future: if a later extractor explicitly consumes another's delta (e.g., a hypothetical risk_extractor that reads medical.comorbidities), the gather group must split into a two-phase DAG. Out of scope for v6.
6. Token budget per stage
Per extractor call (averaged across the 6 extractors at Haiku tier):
- Input: ~1,200 tokens (system_prompt ~800 + user_content with existing_data + message ~400)
- Output: ~400 tokens (JSON delta + filler)
- Per call cost: ~$0.0002 (Haiku 4.5 published rates, see docs/reference/llm-evaluation.md)
Per-stage extractor cost = (count of run cells in row) × ~$0.0002.
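The per-stage arithmetic is linear in the number of extractor calls. A small sketch using the per-call averages stated above as constants (the `stage_budget` helper is illustrative):

```python
# Per-call averages taken from the figures above.
INPUT_TOKENS_PER_CALL = 1_200
OUTPUT_TOKENS_PER_CALL = 400
COST_PER_CALL_USD = 0.0002


def stage_budget(calls: int) -> dict[str, float]:
    """Token and cost budget for a stage that makes `calls` extractor calls."""
    return {
        "input_tokens": calls * INPUT_TOKENS_PER_CALL,
        "output_tokens": calls * OUTPUT_TOKENS_PER_CALL,
        "cost_cents": round(calls * COST_PER_CALL_USD * 100, 4),
    }


discovery = stage_budget(2)           # 2 run cells
match_review_worst = stage_budget(5)  # 4 run cells + INT when the cond gate fires
```

Here `discovery` comes out at 2,400 input tokens, 800 output tokens and 0.04 cents, and the worst-case `match_review` at 6,000, 2,000 and 0.1 cents.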
| Stage | Extractor calls (steady state) | Extractor input tokens | Extractor output tokens | Cost (¢, Haiku) |
|---|---|---|---|---|
| `discovery` | 2 | 2,400 | 800 | ~0.04 |
| `procedure_identification` | 2 | 2,400 | 800 | ~0.04 |
| `records_collection` | 2 | 2,400 | 800 | ~0.04 |
| `match_review` | 4-5 (cond gate on INT) | 4,800-6,000 | 1,600-2,000 | ~0.08-0.10 |
| `consent_capture` | 1-2 (cond gate on INT) | 1,200-2,400 | 400-800 | ~0.02-0.04 |
| `mso_offer` | 3 | 3,600 | 1,200 | ~0.06 |
| `scheduling` | 4 | 4,800 | 1,600 | ~0.08 |
| `pre_travel` | 4 | 4,800 | 1,600 | ~0.08 |
| `in_treatment` | 1 | 1,200 | 400 | ~0.02 |
| `recovery_offer` | 3 | 3,600 | 1,200 | ~0.06 |
| `recovery_followup` | 2 | 2,400 | 800 | ~0.04 |
| `support` | 2 | 2,400 | 800 | ~0.04 |
Per-turn extractor token budget (worst case, match_review): ~6,000 in + 2,000 out (extractor pipeline only).
Total per-turn cost (extractor pipeline + main conversation LLM call) at worst-case match_review:
- Extractor pipeline: ~$0.0010 (5 Haiku calls)
- Main conversation call (per §2.5): ~4,400 input tok + 600 output tok at Haiku = ~$0.0008
- Combined: ~$0.0018 / turn at the most expensive stage
Steady-state per-turn budget across all stages (weighted by stage occupancy): ~$0.0010 / turn (matches v6 spec §Appendix C cost projection within ~10%).
Observability: per spec §3.10, every extractor call already stamps agent_name on the Langfuse trace. Phase 2b adds the stage:{stage_name} trace tag (per §3.1 + §3.10) so the Metabase "cost per stage" query is a group-by on the existing trace data — no new pipeline needed.
7. Open questions for Phase 2b implementation
These need code-level investigation once the Phase 2b implementer (Sonnet) starts.
- `medical_extractor` output schema is the largest in the system (lines 60-90 of `medical_extractor.py`: nested patient_demographics, procedure, diagnosis blocks) and is not formally documented as a Pydantic model. Phase 2b should formalise it as `app/schemas/extractor_outputs/medical.py` and pin every consumer to it; the matrix's "output consumer" column is brittle until then.
- `intent_extractor` conditional gate predicate: the `decision_stage_volatile` predicate in §3 is currently nominal. Phase 2b must decide: is this an LLM-routed check (free text → re-run extractor) or a deterministic compare against `layer_state.intent_capture`? Recommendation: deterministic, since the field is already populated.
- Where is the `extractors_active: [list]` field consumed? v6 spec §3.4 mitigation says "orchestrator reads this." Today, `triage_agent.py:902` hardcodes "active layer always runs + signal gate for others." Phase 2b needs to thread `stages.yaml.extractors_active` through `run_extractors_node`; the natural location is between `state.get("active_layer")` and the task-builder loop (lines 928-947).
- `recovery_checkin` is the only extractor with a non-Haiku-only failure path (regex keyword scan continues on LLM fail per line 27). Phase 2b should decide: do other extractors get equivalent deterministic fallbacks, or is the empty-delta-on-fail pattern sufficient? Spec §3.4 says "deterministic fallback per agent fallback discipline" but is silent on what that means for extractors that have no obvious deterministic equivalent.
- `match_review` 5-wide gather is the p95 latency hotspot. Phase 2b should add a Langfuse dashboard panel for "match_review extractor pipeline duration" before ramping v6 traffic. The spec §2.5 acceptance criterion is segment cache hit rate, but match_review's bottleneck is extractor parallel-gather latency, not cache.
- `support` stage runs INT + MED (per matrix). Per v6 spec §2.5, `support` "collapses to base + patient_context only" for the main conversation LLM call. Does that exclusion apply to extractors too, or only to the conversation turn? Recommendation: keep extractors warm in `support` (cheap insurance against losing signals while resolver is stuck), but Phase 2b must confirm.
- `stage_indicator` debug card (per v6 spec §3.7) should expose the resolved `extractors_active` list when the BE-side flag is on. Adds a small admin-UX nicety for triaging Phase 2b wiring bugs.
- `mso_second_opinion.yaml` migration from v4 (per §3.5.1 governance item 9): when Phase 2b lands the addendum file, ensure the v5 path keeps the v4-shaped offer text intact. Regression test: v5 + `mso_patient_offer_enabled=true` produces identical offer wording before/after the migration.
8. Appendix: keys cross-reference
stage_id keys MUST be byte-identical between this doc, the truth-table doc, stages.yaml, and app/services/stage_resolver.py. extractor_id keys MUST match the agent_name values passed to llm_gateway.invoke() (today these are intent_extractor.extract, medical_extractor.extract, etc.; note the .extract suffix). The matrix above uses the short form (intent_extractor) for readability; the .extract suffix on agent_name is kept for Langfuse trace-tag continuity, so no Langfuse query needs to change.
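A drift check along these lines could enforce the byte-identical rule. The sketch assumes each source (this doc, the truth-table doc, stages.yaml, the resolver) has already been parsed into a set of key strings; `key_drift` and `agent_name` are hypothetical helpers, and the suffix mapping is only shown for the two extractors whose agent_name the text above confirms.

```python
STAGE_IDS = frozenset({
    "discovery", "procedure_identification", "records_collection",
    "match_review", "consent_capture", "mso_offer", "scheduling",
    "pre_travel", "in_treatment", "recovery_offer", "recovery_followup",
    "support",
})


def key_drift(reference: frozenset[str], other: set[str]) -> dict[str, set[str]]:
    """Exact string comparison: which keys are missing or unexpected in `other`."""
    return {
        "missing": set(reference - other),
        "unexpected": set(other - reference),
    }


def agent_name(extractor_id: str) -> str:
    """Short matrix form -> Langfuse agent_name (assumed '.extract' suffix)."""
    return f"{extractor_id}.extract"


# Example: a stages.yaml that spells one key with a hyphen instead of an underscore.
yaml_keys = (set(STAGE_IDS) - {"pre_travel"}) | {"pre-travel"}
drift = key_drift(STAGE_IDS, yaml_keys)
```

In this example the drift report lists `pre_travel` as missing and `pre-travel` as unexpected, which is the kind of near-miss a visual review tends to overlook.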