v6 Stages × Extractors Matrix

Status: DRAFT (Phase 0 companion artifact)
Parent spec: conversation-v6-feature.md §2.5, §3.4, §3.1
Sibling: v6-stage-resolver-truth-table.md (Phase 0 companion)
Blocks: v6 Phase 2b (config/prompts/stages.yaml + extractors_active wiring)
Owner: Backend / Agents
Last updated: 2026-05-12

This document maps every v6 conversation stage to the extractor pipeline that runs during it. It is wiring, not content — prompt authoring lives in Phase 2b. The matrix is the source of truth for the extractors_active: [list] field that each stages.yaml entry carries (per v6 spec §3.4 mitigation).


1. Stage list

The 12 stages below match stage_id keys in the sibling truth-table doc. support is the fallback bucket — every malformed-state or no-rule-matches case lands here.

| stage_id | Goal |
|---|---|
| discovery | Find out what procedure / why-now / where they're starting from |
| procedure_identification | Confirm specific procedure + laterality / anatomy / mechanism |
| records_collection | Get medical records uploaded for clinical context |
| match_review | Patient reviews matched providers, asks questions, picks subset |
| consent_capture | HIPAA / GDPR / DPDP consent for record forwarding |
| mso_offer | Offer Medical Second Opinion video consult before booking |
| scheduling | Pick MSO video slot / book consultation |
| pre_travel | Logistics, visas, passport, travel readiness |
| in_treatment | During-stay support (mostly coordinator-handled; agent is light-touch) |
| recovery_offer | Post-op recovery facility offer (ADR-0018 §K) |
| recovery_followup | Post-op milestone check-ins (ADR-0018 §K) |
| support | Fallback: malformed state or no rule matches; base + patient_context only |

Total: 12 stages (11 forward-flow + 1 fallback).


2. Extractor catalog

Sourced from app/services/extractors/*.py (verified on main 2026-05-12) plus the v6-introduced gate. The existing 5 layer extractors plus recovery_checkin_extractor (landed via PR #832) cover every documented v6 stage. No new extractors are added by v6 — the v6 work is wiring + system-prompt language sweep per spec §3.4.

| extractor_id | Source file | Tier | Caching | Failure mode |
|---|---|---|---|---|
| intent_extractor | app/services/extractors/intent_extractor.py | Haiku 4.5 | None (per-turn LLM call; output merged into layer_state.intent_capture) | Returns empty delta on LLMGatewayError; layer completion stays at previous value (_base.py:97-99) |
| medical_extractor | app/services/extractors/medical_extractor.py | Haiku 4.5 | ICD-10/SNOMED code map cached 30 days in Redis under icd_cache:v2:dx=...\|proc=... (line 27) | Same non-blocking pattern; ICD map degrades to empty list if cache + LLM both fail |
| travel_extractor | app/services/extractors/travel_extractor.py | Haiku 4.5 | None at extractor level; transport-tier table loaded once at import via scoring_config.get_config | Returns empty delta; compute_transport_tier() falls back to T1 if scores all zero |
| logistics_extractor | app/services/extractors/logistics_extractor.py | Haiku 4.5 | None; VISA_TABLE is static module-level dict (line 48) | Returns empty delta; visa derivation returns None for unknown country pairs |
| financial_extractor | app/services/extractors/financial_extractor.py | Haiku 4.5 | None at extractor level | Returns empty delta; qualify_budget() returns incomplete when budget falsy |
| recovery_checkin | app/services/extractors/recovery_checkin_extractor.py | Haiku 4.5 | Escalation keyword YAML cached via @lru_cache(maxsize=1) (line 69) | Deterministic regex keyword pass still runs on LLM failure (line 27); only LLM-derived fields go empty |

Input shape (uniform across all six — see _base.run_extraction, lines 49-122):

  • message_text: str — the latest user turn's content
  • existing_data: dict[str, Any] — the current per-layer slice from PatientLayerState
  • tenant_id: str | None + case_id: str | None — propagated for Langfuse session/user tagging

System prompt is static per extractor (hardcoded module-level constant); the v6 §3.4 sweep replaces "layer N" wording with stage-equivalent language but preserves schemas. No patient content embedded in system prompts (§3.4 CI gate tests/test_extractor_prompts_pii_safe.py).

Output schema (uniform envelope; extractor-specific delta inside):

{
  "delta": { /* extractor-specific JSON keys, matches schema in the extractor's SYSTEM_PROMPT */ },
  "completion_estimate": 0.0  // float in [0, 1]
}

The reducer (_merge_layer_deltas, triage_agent.py:323-) merges deltas into layer_state using high-water-mark on completion + deep merge on data.
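The merge rule described above (high-water mark on completion, deep merge on data) can be sketched as follows. This is a minimal illustration, not the actual _merge_layer_deltas code: the helper names deep_merge and merge_layer, and the layer slice shape {"data": ..., "completion": ...}, are assumptions made for the example.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], delta: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict: nested dicts merge recursively, other delta values overwrite."""
    merged = deepcopy(base)
    for key, value in delta.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def merge_layer(existing: Dict[str, Any], envelope: Dict[str, Any]) -> Dict[str, Any]:
    """Merge one extractor envelope into one layer slice (hypothetical slice shape)."""
    return {
        "data": deep_merge(existing.get("data", {}), envelope.get("delta", {})),
        # High-water mark: completion never moves backwards.
        "completion": max(
            existing.get("completion", 0.0),
            envelope.get("completion_estimate", 0.0),
        ),
    }


layer = {"data": {"procedure": {"name": "knee replacement"}}, "completion": 0.6}
envelope = {"delta": {"procedure": {"laterality": "left"}}, "completion_estimate": 0.4}
merged = merge_layer(layer, envelope)

# An empty delta (the failure-mode envelope) leaves the slice unchanged.
unchanged = merge_layer(layer, {"delta": {}, "completion_estimate": 0.0})
```

In this sketch a lower completion_estimate never lowers the stored completion, which matches the "layer completion stays at previous value" failure behaviour listed in the catalog.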


3. Stage × Extractor matrix

Legend:

  • run — extractor is in this stage's extractors_active list and always invoked when its signal gate fires
  • skip — extractor not wired; orchestrator does not spawn it for this stage
  • cond(<predicate>) — extractor is wired but only invoked when the predicate is true

Column abbreviations: INT=intent, MED=medical, TRV=travel, LOG=logistics, FIN=financial, REC=recovery_checkin.

| Stage | INT | MED | TRV | LOG | FIN | REC |
|---|---|---|---|---|---|---|
| discovery | run | run | skip | skip | skip | skip |
| procedure_identification | run | run | skip | skip | skip | skip |
| records_collection | run | run | skip | skip | skip | skip |
| match_review | cond(decision_stage_volatile) | run | run | run | run | skip |
| consent_capture | cond(decision_stage_volatile) | run | skip | skip | skip | skip |
| mso_offer | run | run | skip | skip | run | skip |
| scheduling | skip | run | run | run | run | skip |
| pre_travel | skip | run | run | run | run | skip |
| in_treatment | skip | run | skip | skip | skip | skip |
| recovery_offer | skip | run | run | skip | run | skip |
| recovery_followup | skip | run | skip | skip | skip | run |
| support | run | run | skip | skip | skip | skip |

run cell count: 30 across 12 stages × 6 extractors (= 72 cells total; 30 run, 2 cond, 40 skip).
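The cell counts can be re-derived mechanically from the matrix, which is a cheap guard against the table and the summary line drifting apart. The sketch below transcribes the matrix into a Python dict; the names MATRIX, cell_counts and extractors_active_for are illustrative only, and "cond" abbreviates cond(decision_stage_volatile).

```python
from collections import Counter
from typing import Dict, List, Tuple

COLUMNS = ("INT", "MED", "TRV", "LOG", "FIN", "REC")

# One row per stage, cells in COLUMNS order.
MATRIX: Dict[str, Tuple[str, ...]] = {
    "discovery": ("run", "run", "skip", "skip", "skip", "skip"),
    "procedure_identification": ("run", "run", "skip", "skip", "skip", "skip"),
    "records_collection": ("run", "run", "skip", "skip", "skip", "skip"),
    "match_review": ("cond", "run", "run", "run", "run", "skip"),
    "consent_capture": ("cond", "run", "skip", "skip", "skip", "skip"),
    "mso_offer": ("run", "run", "skip", "skip", "run", "skip"),
    "scheduling": ("skip", "run", "run", "run", "run", "skip"),
    "pre_travel": ("skip", "run", "run", "run", "run", "skip"),
    "in_treatment": ("skip", "run", "skip", "skip", "skip", "skip"),
    "recovery_offer": ("skip", "run", "run", "skip", "run", "skip"),
    "recovery_followup": ("skip", "run", "skip", "skip", "skip", "run"),
    "support": ("run", "run", "skip", "skip", "skip", "skip"),
}


def cell_counts(matrix: Dict[str, Tuple[str, ...]]) -> Counter:
    """Count run / cond / skip cells, rejecting rows with the wrong width."""
    counts: Counter = Counter()
    for stage, row in matrix.items():
        if len(row) != len(COLUMNS):
            raise ValueError(f"{stage}: expected {len(COLUMNS)} cells, got {len(row)}")
        counts.update(row)
    return counts


def extractors_active_for(stage: str) -> List[str]:
    """Columns wired for a stage: every cell that is not 'skip'."""
    return [col for col, cell in zip(COLUMNS, MATRIX[stage]) if cell != "skip"]


counts = cell_counts(MATRIX)
```

Here extractors_active_for("match_review") yields the five wired columns, including the conditionally gated INT.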

Conditional predicate decision_stage_volatile:

intent.decision_stage in {comparing_options, just_exploring}
   OR
patient_state.layer_state.intent_capture.completion < 0.8

Used in match_review + consent_capture to re-run intent_extractor when the patient is still pivoting (per v6 spec §2.6 "backward stage moves": match_review → procedure_identification is intentional when the patient pivots procedures). When intent is locked, the extractor is skipped to save a Haiku call.
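A deterministic version of the predicate could look like the sketch below. It assumes the intent slice is shaped as {"data": {"decision_stage": ...}, "completion": ...}; that shape, the function signature, and the "ready_to_book" value used in the usage lines are hypothetical and not taken from the codebase.

```python
from typing import Any, Mapping

VOLATILE_DECISION_STAGES = frozenset({"comparing_options", "just_exploring"})
INTENT_COMPLETION_THRESHOLD = 0.8


def decision_stage_volatile(layer_state: Mapping[str, Any]) -> bool:
    """True when intent_extractor should re-run in match_review / consent_capture."""
    intent = layer_state.get("intent_capture") or {}
    decision_stage = (intent.get("data") or {}).get("decision_stage")
    completion = float(intent.get("completion") or 0.0)
    return (
        decision_stage in VOLATILE_DECISION_STAGES
        or completion < INTENT_COMPLETION_THRESHOLD
    )


# "ready_to_book" is a made-up example of a locked decision_stage value.
locked = {"intent_capture": {"data": {"decision_stage": "ready_to_book"}, "completion": 0.9}}
pivoting = {"intent_capture": {"data": {"decision_stage": "comparing_options"}, "completion": 0.95}}
```

Note that a missing intent slice counts as volatile in this sketch, because completion defaults to 0.0; whether that is the desired behaviour is a Phase 2b decision.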

3.1 Per-run cell: input slice + output consumer

For each run cell, the input is always the latest user message + the existing layer slice (per _base.run_extraction). The differentiator is which triage decision in the stage consumes the merged output. The mapping below uses stage_id rows; entries that begin with "Catch" are cells where the extractor runs but is not load-bearing for the stage's primary advance predicate (signal-gated catch).

| Cell | Output consumer in this stage |
|---|---|
| discovery × INT | procedure_identified flag (workflow_state); urgency / emotional readiness for guidance tone |
| discovery × MED | First-pass procedure name → flips procedure_identified → advances to procedure_identification |
| procedure_identification × INT | Refines decision_stage (just_exploring → comparing_options); feeds match_review gate |
| procedure_identification × MED | Confirms procedure + body_site/laterality + comorbidities → unblocks records_collection |
| records_collection × INT | Catch: signal-gated; runs if patient mentions urgency or new fear while uploading |
| records_collection × MED | Symptoms, medications, allergies, age; populates FHIR observations + advances medical_status.completion ≥ 0.7 |
| match_review × INT (cond) | Detects pivot (decision_stage → comparing_options) → triggers backward stage move per §2.6 |
| match_review × MED | New comorbidity / medication captured mid-review → re-scores match weights |
| match_review × TRV | First time mobility / oxygen surface; transport tier feeds match relevance |
| match_review × LOG | Country preferences + timeline → narrows shortlist |
| match_review × FIN | Budget band + insurance preauth → match qualification (qualified / stretch / mismatched) |
| consent_capture × INT (cond) | Detects last-minute hesitation → guidance softens, never re-asks consent until intent stabilises |
| consent_capture × MED | Catch: captures any new comorbidity revealed during consent reading; FHIR addendum |
| mso_offer × INT | Reads primary_fear → MSO offer framing (fear of complications → emphasise second opinion) |
| mso_offer × MED | Catch: symptoms surfaced during offer can re-rank consult urgency |
| mso_offer × FIN | Insurance coverage of MSO consult → affects offer wording (covered vs out-of-pocket) |
| scheduling × MED | Time-zone / fitness affecting consult timing |
| scheduling × TRV | Mobility → in-person vs video consult preference |
| scheduling × LOG | Timezone (country_of_residence) → consult slot proposal |
| scheduling × FIN | Catch: payment method surfaced during scheduling triggers payment-link card |
| pre_travel × MED | Recent medication changes affecting travel fitness |
| pre_travel × TRV | Transport tier (T1-T4) → travel-readiness checklist |
| pre_travel × LOG | Visa derivation + companion + timeline → pre-travel readiness card |
| pre_travel × FIN | Catch: final cost confirmation before flights |
| in_treatment × MED | Light-touch: medication / symptom updates flowing to coordinator |
| recovery_offer × MED | Procedure category determines recovery-facility eligibility |
| recovery_offer × TRV | Current mobility → recovery facility tier match |
| recovery_offer × FIN | Recovery facility cost vs budget → offer framing |
| recovery_followup × MED | Medication adherence, symptom check via free text |
| recovery_followup × REC | Primary: pain_level + escalation keywords → Telegram alert + coordinator escalation (ADR-0018 §K) |
| support × INT | Captures any signal the patient drops while resolver is stuck; kept warm so recovery from support is fast |
| support × MED | Same: opportunistic capture; never advances stage from support alone |

The cells marked catch run via the existing signal-gate mechanism (detect_layer_signals, triage_agent.py:297-) — already production-shipped behind flag extractor_signal_gating_enabled. v6 re-uses this gate verbatim. Phase 2b only declares extractors_active in stages.yaml; the gate handles the conditional skip.
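One possible reading of how the declared list and the signal gate combine is sketched below: an extractor is spawned only if it is declared in extractors_active, the gate signalled it, and any cond predicate attached to it holds. The function select_extractors and its parameters are hypothetical; the real wiring in run_extractors_node may differ (for example, today the active layer's extractor always runs).

```python
from typing import Callable, Dict, Iterable, List, Optional, Set


def select_extractors(
    extractors_active: Iterable[str],
    signaled: Set[str],
    cond_predicates: Optional[Dict[str, Callable[[], bool]]] = None,
) -> List[str]:
    """Return the extractor ids to spawn this turn, in declaration order."""
    cond_predicates = cond_predicates or {}
    selected: List[str] = []
    for extractor_id in extractors_active:
        if extractor_id not in signaled:
            continue  # signal gate did not fire for this extractor
        predicate = cond_predicates.get(extractor_id)
        if predicate is not None and not predicate():
            continue  # cond(...) cell whose predicate is false
        selected.append(extractor_id)
    return selected


# consent_capture declares INT (cond) + MED; travel is signalled but not wired.
chosen = select_extractors(
    ["intent_extractor", "medical_extractor"],
    signaled={"intent_extractor", "medical_extractor", "travel_extractor"},
    cond_predicates={"intent_extractor": lambda: False},
)
```

In this reading an extractor that is signalled but not declared for the stage is never spawned, which is the behaviour the skip cells describe.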


4. Knowledge addendum gating

Per v6 spec §2.3 + §2.5 hard constraint: at most 1 knowledge addendum attaches per turn (Anthropic 4-cache-breakpoint limit). Addendums are triggered by patient context, not by stage — but stage interacts with the trigger predicate. Cross-reference with the sibling truth-table doc:

| Addendum | Trigger predicate (from §2.3) | Stages where this addendum is most likely to attach |
|---|---|---|
| procedure_clinical_facts/{procedure_slug}.yaml | patient.procedure identified | procedure_identification, records_collection, match_review, mso_offer, consent_capture |
| financial_options.yaml | Patient asked about cost OR budget tier undetermined | match_review, mso_offer, scheduling, pre_travel, recovery_offer |
| post_travel_logistics.yaml | stage_active in {pre_travel, in_treatment} AND patient.passport_status != confirmed | pre_travel, in_treatment (predicate is stage-scoped) |
| insurance_handling.yaml | Patient mentioned insurance OR funding_source == insurance | match_review, consent_capture, mso_offer, scheduling, recovery_offer |
| mso_second_opinion.yaml (migrated from v4 line 204) | mso_patient_offer_enabled flag true AND stage in {mso_offer, scheduling} | mso_offer, scheduling |

Priority ordering (per v6 spec §2.3 — clinical-safety addendums above commercial):

  1. procedure_clinical_facts (clinical) — highest priority
  2. post_travel_logistics (clinical-adjacent — fitness-to-travel)
  3. insurance_handling (commercial)
  4. financial_options (commercial)
  5. mso_second_opinion (commercial) — lowest priority

When multiple match, the highest-priority addendum wins; the rest are dropped that turn (§2.3 cap of 1).
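The cap-of-1 rule amounts to a first-match scan over the priority list. The sketch below assumes the trigger predicates have already been evaluated into a set of addendum names; pick_addendum and ADDENDUM_PRIORITY are illustrative names, not the planned knowledge_addendum_resolver.py API.

```python
from typing import Iterable, Optional

# Highest priority first, following the ordering listed above.
ADDENDUM_PRIORITY = (
    "procedure_clinical_facts",
    "post_travel_logistics",
    "insurance_handling",
    "financial_options",
    "mso_second_opinion",
)


def pick_addendum(triggered: Iterable[str]) -> Optional[str]:
    """Return the single addendum to attach this turn, or None if nothing triggered."""
    triggered_set = set(triggered)
    for name in ADDENDUM_PRIORITY:
        if name in triggered_set:
            return name
    return None


winner = pick_addendum({"financial_options", "procedure_clinical_facts"})
```

Names outside ADDENDUM_PRIORITY are ignored in this sketch; a real resolver might prefer to raise on unknown names so that typos in the config surface early.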

Stage-by-stage default addendum (the one that fires most often, given typical patient state at that stage):

| Stage | Default addendum | Notes |
|---|---|---|
| discovery | none | Procedure not yet identified; no triggers met |
| procedure_identification | procedure_clinical_facts/{slug}.yaml (once procedure surfaces) | Drops the moment medical_extractor pins procedure.name |
| records_collection | procedure_clinical_facts/{slug}.yaml | Stable |
| match_review | financial_options.yaml OR procedure_clinical_facts/{slug} | Tie broken by priority; clinical wins |
| consent_capture | insurance_handling.yaml (if insurance) else procedure_clinical_facts/{slug} | Stable |
| mso_offer | mso_second_opinion.yaml (gated by mso_patient_offer_enabled) | Falls back to procedure_clinical_facts if MSO offer off |
| scheduling | financial_options.yaml | |
| pre_travel | post_travel_logistics.yaml | Predicate already stage-scoped |
| in_treatment | post_travel_logistics.yaml (residual logistics issues) | Light-touch overall |
| recovery_offer | financial_options.yaml | Cost is the chief recovery-offer concern |
| recovery_followup | none | Recovery turn is intentionally narrow: extractor + base only |
| support | none | Per §2.5, support collapses to base + patient_context only |

The full predicate gating belongs in app/agents/knowledge_addendum_resolver.py (new in Phase 2b); this table is the cheat-sheet for the resolver.


5. Parallel execution graph

Per triage_agent.py:953-957 (asyncio.gather(...) on the extractor task list), today's pipeline runs all signaled extractors in parallel within a single map-reduce node. v6 preserves this; the only delta is which extractors enter the gather group.
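The gather pattern, combined with the empty-delta failure mode from the catalog, can be illustrated with fake extractors. Everything below is a stand-alone sketch: the extractor signature, run_safely, run_parallel_group and the short ids used as dict keys are assumptions, and a generic Exception stands in for LLMGatewayError.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Envelope = Dict[str, Any]
# Hypothetical extractor signature: (message_text, existing_data) -> envelope.
Extractor = Callable[[str, Dict[str, Any]], Awaitable[Envelope]]


def empty_envelope() -> Envelope:
    """Fallback result: an empty delta, so the reducer leaves the layer untouched."""
    return {"delta": {}, "completion_estimate": 0.0}


async def run_safely(extract: Extractor, message_text: str, existing: Dict[str, Any]) -> Envelope:
    try:
        return await extract(message_text, existing)
    except Exception:  # stand-in for LLMGatewayError in this sketch
        return empty_envelope()


async def run_parallel_group(
    group: Dict[str, Extractor], message_text: str, layer_state: Dict[str, Any]
) -> Dict[str, Envelope]:
    """Run every extractor in the group concurrently; results keep the group's order."""
    ids = list(group)
    results = await asyncio.gather(
        *(run_safely(group[i], message_text, layer_state.get(i, {})) for i in ids)
    )
    return dict(zip(ids, results))


# Fake extractors standing in for real Haiku calls.
async def fake_medical(message_text: str, existing: Dict[str, Any]) -> Envelope:
    await asyncio.sleep(0.01)
    return {"delta": {"procedure": {"name": "knee replacement"}}, "completion_estimate": 0.5}


async def fake_intent(message_text: str, existing: Dict[str, Any]) -> Envelope:
    raise RuntimeError("simulated gateway failure")


results = asyncio.run(
    run_parallel_group({"INT": fake_intent, "MED": fake_medical}, "I need a knee replacement", {})
)
```

Because each task catches its own failure, one extractor erroring does not cancel or delay the others in the group.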

Hard data dependency (one rule today, one rule v6 may add):

  1. medical_extractor reads intent classification implicitly (procedure framing depends on whether decision_stage indicates the patient is asserting vs exploring). Today this is resolved by the LLM reading existing_data (the prior layer_state), so the order is "previous turn's intent informs this turn's medical." There is no in-turn dependency that would prevent parallel execution; the LLM reads existing_data snapshot.

  2. recovery_checkin is independent of all others — its escalation logic (regex keyword scan) is deterministic and does not consume other extractors' deltas. It can run in parallel with medical_extractor in recovery_followup.

Per-stage parallel groups (everything within a row runs concurrently via asyncio.gather):

| Stage | Parallel group | Critical path |
|---|---|---|
| discovery | {INT, MED} | max(haiku) over the 2 calls |
| procedure_identification | {INT, MED} | max(haiku); both Haiku |
| records_collection | {INT, MED} | max(haiku) + ICD cache lookup |
| match_review | {INT(cond), MED, TRV, LOG, FIN} | max(haiku); 5-wide gather |
| consent_capture | {INT(cond), MED} | max(haiku) |
| mso_offer | {INT, MED, FIN} | max(haiku) |
| scheduling | {MED, TRV, LOG, FIN} | max(haiku); 4-wide |
| pre_travel | {MED, TRV, LOG, FIN} | max(haiku); 4-wide |
| in_treatment | {MED} | haiku |
| recovery_offer | {MED, TRV, FIN} | max(haiku) |
| recovery_followup | {MED, REC} | max(haiku, regex_keyword_scan) |
| support | {INT, MED} | max(haiku) |

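Per-Haiku-call latency budget: ~600-900 ms p50, ~1500 ms p95 (per current Langfuse data, model_registry routes all to claude-haiku-4.5). The 5-wide match_review gather is the worst case: because all five Haiku calls run concurrently, its latency is that of the slowest call rather than the sum, so it stays in the ~1500 ms range. The gather's own p95 does sit somewhat above the per-call p95, since the slowest of five calls is more likely to land in the tail than any single call.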
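How a wide gather relates to the per-call p95 can be estimated with a short calculation. The sketch assumes the calls' latencies are independent, which is a simplification (calls to the same provider are likely correlated), so the numbers are indicative only. The function names are illustrative.

```python
def prob_all_within(per_call_quantile: float, width: int) -> float:
    """Probability that every call in a width-wide gather finishes within the
    latency at the given per-call quantile, assuming independent calls."""
    return per_call_quantile ** width


def per_call_quantile_needed(gather_quantile: float, width: int) -> float:
    """Per-call quantile whose latency bounds the gather at gather_quantile."""
    return gather_quantile ** (1.0 / width)


# 5-wide match_review gather against a per-call p95.
all_within_p95 = prob_all_within(0.95, 5)
needed = per_call_quantile_needed(0.95, 5)
```

Under the independence assumption, all five calls finish within the per-call p95 latency about 77% of the time, and the gather's p95 corresponds to roughly the per-call p99. This supports measuring the gather duration directly rather than inferring it from per-call percentiles.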

Future: if a later extractor explicitly consumes another extractor's delta (e.g., a hypothetical risk_extractor that reads medical.comorbidities), the gather group must split into a two-phase DAG. Out of scope for v6.


6. Token budget per stage

Per extractor call (averaged across the 6 extractors at Haiku tier):

  • Input: ~1,200 tokens (system_prompt ~800 + user_content with existing_data + message ~400)
  • Output: ~400 tokens (JSON delta + filler)
  • Per call cost: ~$0.0002 (Haiku 4.5 published rates, see docs/reference/llm-evaluation.md)

Per-stage extractor cost = (count of run cells in row) × ~$0.0002.

| Stage | Extractor calls (steady state) | Extractor input tokens | Extractor output tokens | Cost (¢, Haiku) |
|---|---|---|---|---|
| discovery | 2 | 2,400 | 800 | ~0.04 |
| procedure_identification | 2 | 2,400 | 800 | ~0.04 |
| records_collection | 2 | 2,400 | 800 | ~0.04 |
| match_review | 4-5 (cond gate on INT) | 4,800-6,000 | 1,600-2,000 | ~0.08-0.10 |
| consent_capture | 1-2 (cond gate on INT) | 1,200-2,400 | 400-800 | ~0.02-0.04 |
| mso_offer | 3 | 3,600 | 1,200 | ~0.06 |
| scheduling | 4 | 4,800 | 1,600 | ~0.08 |
| pre_travel | 4 | 4,800 | 1,600 | ~0.08 |
| in_treatment | 1 | 1,200 | 400 | ~0.02 |
| recovery_offer | 3 | 3,600 | 1,200 | ~0.06 |
| recovery_followup | 2 | 2,400 | 800 | ~0.04 |
| support | 2 | 2,400 | 800 | ~0.04 |

Per-turn extractor token budget (worst case, match_review): ~6,000 in + 2,000 out (extractor pipeline only).
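The per-stage figures follow directly from the per-call averages, so they can be regenerated if those averages are revised. The sketch below uses this section's numbers as constants; stage_budget is an illustrative helper, and the constants are the document's estimates rather than measured values.

```python
from typing import Dict, Union

# Per-call averages quoted in this section (estimates, not measurements).
INPUT_TOKENS_PER_CALL = 1_200
OUTPUT_TOKENS_PER_CALL = 400
COST_PER_CALL_USD = 0.0002


def stage_budget(calls: int) -> Dict[str, Union[int, float]]:
    """Extractor-pipeline budget for a stage that makes `calls` extractor calls."""
    return {
        "calls": calls,
        "input_tokens": calls * INPUT_TOKENS_PER_CALL,
        "output_tokens": calls * OUTPUT_TOKENS_PER_CALL,
        # Converted to cents to match the table's cost column.
        "cost_cents": round(calls * COST_PER_CALL_USD * 100, 2),
    }


# match_review runs 4 calls when the INT cond gate is closed, 5 when it is open.
match_review_low = stage_budget(4)
match_review_high = stage_budget(5)
```

For stages with a cond gate, the table's ranges correspond to evaluating the helper at both call counts.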

Total per-turn cost (extractor pipeline + main conversation LLM call) at worst-case match_review:

  • Extractor pipeline: ~$0.0010 (5 Haiku calls)
  • Main conversation call (per §2.5): ~4,400 input tok + 600 output tok at Haiku = ~$0.0008
  • Combined: ~$0.0018 / turn at the most expensive stage

Steady-state per-turn budget across all stages (weighted by stage occupancy): ~$0.0010 / turn (matches v6 spec §Appendix C cost projection within ~10%).

Observability: per spec §3.10, every extractor call already stamps agent_name on the Langfuse trace. Phase 2b adds the stage:{stage_name} trace tag (per §3.1 + §3.10) so the Metabase "cost per stage" query is a group-by on the existing trace data — no new pipeline needed.


7. Open questions for Phase 2b implementation

These need code-level investigation when the Phase 2b implementer (Sonnet) picks up the work.

  1. medical_extractor output schema is the largest in the system (lines 60-90 of medical_extractor.py: nested patient_demographics, procedure, diagnosis blocks) and is not formally documented as a Pydantic model. Phase 2b should formalise it as app/schemas/extractor_outputs/medical.py and pin every consumer to it; the matrix's "output consumer" column is brittle until then.

  2. intent_extractor conditional gate predicate — the decision_stage_volatile predicate in §3 is currently nominal. Phase 2b must decide: is this an LLM-routed check (free text → re-run extractor) or a deterministic compare against layer_state.intent_capture? Recommendation: deterministic, since the field is already populated.

  3. Where is the extractors_active: [list] field consumed? v6 spec §3.4 mitigation says "orchestrator reads this." Today, triage_agent.py:902 hardcodes "active layer always runs + signal gate for others." Phase 2b needs to thread stages.yaml.extractors_active through run_extractors_node — the natural location is between state.get("active_layer") and the task-builder loop (lines 928-947).

  4. recovery_checkin is the only extractor whose failure path does not depend solely on Haiku (the regex keyword scan continues on LLM failure, per line 27). Phase 2b should decide: do other extractors get equivalent deterministic fallbacks, or is the empty-delta-on-fail pattern sufficient? Spec §3.4 says "deterministic fallback per agent fallback discipline" but is silent on what that means for extractors that have no obvious deterministic equivalent.

  5. match_review 5-wide gather is the p95 latency hotspot. Phase 2b should add a Langfuse dashboard panel for "match_review extractor pipeline duration" before ramping v6 traffic — the spec §2.5 acceptance criterion is segment cache hit rate, but match_review's bottleneck is extractor parallel-gather latency, not cache.

  6. support stage runs INT + MED (per matrix). Per v6 spec §2.5, support "collapses to base + patient_context only" for the main conversation LLM call. Does that exclusion apply to extractors too, or only to the conversation turn? Recommendation: keep extractors warm in support (cheap insurance against losing signals while resolver is stuck), but Phase 2b must confirm.

  7. stage_indicator debug card (per v6 spec §3.7) should expose the resolved extractors_active list when the BE-side flag is on. Adds a small admin-UX nicety for triaging Phase 2b wiring bugs.

  8. mso_second_opinion.yaml migration from v4 (per §3.5.1 governance item 9) — when Phase 2b lands the addendum file, ensure the v5 path keeps the v4-shaped offer text intact. Regression test: v5 + mso_patient_offer_enabled=true produces identical offer wording before/after the migration.


8. Appendix — keys cross-reference

stage_id keys MUST be byte-identical between this doc, the truth-table doc, stages.yaml, and app/services/stage_resolver.py. extractor_id keys MUST match the agent_name passed to llm_gateway.invoke() (which today are intent_extractor.extract, medical_extractor.extract, etc. — note the .extract suffix). The matrix above uses the short form (intent_extractor) for readability; the agent_name suffix is preserved per Langfuse trace-tag continuity (no Langfuse query needs to change).
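A CI-style check for the two constraints above could be as small as the sketch below. The helper names (to_agent_name, to_extractor_id, key_mismatches) and the idea of loading each source into a set of keys are assumptions for illustration; how the keys are actually read from stages.yaml or stage_resolver.py is left out.

```python
from typing import Dict, Set

AGENT_NAME_SUFFIX = ".extract"


def to_agent_name(extractor_id: str) -> str:
    """Short matrix form -> agent_name form (e.g. intent_extractor.extract)."""
    return extractor_id + AGENT_NAME_SUFFIX


def to_extractor_id(agent_name: str) -> str:
    """agent_name form -> short matrix form; rejects names without the suffix."""
    if not agent_name.endswith(AGENT_NAME_SUFFIX):
        raise ValueError(f"agent_name lacks {AGENT_NAME_SUFFIX!r}: {agent_name!r}")
    return agent_name[: -len(AGENT_NAME_SUFFIX)]


def key_mismatches(sources: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each source, the keys it is missing relative to the union of all sources.

    Python string equality is exact, so a key differing in case or whitespace
    counts as a different key.
    """
    union: Set[str] = set().union(*sources.values()) if sources else set()
    return {name: union - keys for name, keys in sources.items() if union - keys}


problems = key_mismatches(
    {
        "matrix": {"discovery", "support"},
        "stages.yaml": {"discovery"},
    }
)
```

An empty result means every source carries the same key set; any entry names the source and the keys it lacks.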