v6 Stages × Extractors Matrix

Status: DRAFT (Phase 0 companion artifact)
Parent spec: conversation-v6-feature.md §2.5, §3.4, §3.1
Sibling: v6-stage-resolver-truth-table.md (Phase 0 companion)
Blocks: v6 Phase 2b (config/prompts/stages.yaml + extractors_active wiring)
Owner: Backend / Agents
Last updated: 2026-05-12

This document maps every v6 conversation stage to the extractor pipeline that runs during it. It is wiring, not content — prompt authoring lives in Phase 2b. The matrix is the source of truth for the extractors_active: [list] field that each stages.yaml entry carries (per v6 spec §3.4 mitigation).

1. Stage list

The 12 stages below match stage_id keys in the sibling truth-table doc. support is the fallback bucket — every malformed-state or no-rule-matches case lands here.

`stage_id`	Goal
`discovery`	Find out what procedure / why-now / where they're starting from
`procedure_identification`	Confirm specific procedure + laterality / anatomy / mechanism
`records_collection`	Get medical records uploaded for clinical context
`match_review`	Patient reviews matched providers, asks questions, picks subset
`consent_capture`	HIPAA / GDPR / DPDP consent for record forwarding
`mso_offer`	Offer Medical Second Opinion video consult before booking
`scheduling`	Pick MSO video slot / book consultation
`pre_travel`	Logistics, visas, passport, travel readiness
`in_treatment`	During-stay support (mostly coordinator-handled; agent is light-touch)
`recovery_offer`	Post-op recovery facility offer (ADR-0018 §K)
`recovery_followup`	Post-op milestone check-ins (ADR-0018 §K)
`support`	Fallback — malformed state or no rule matches; base + patient_context only

Total: 12 stages (11 forward-flow + 1 fallback).

2. Extractor catalog

Sourced from app/services/extractors/*.py (verified on main 2026-05-12) plus the v6-introduced gate. The existing 5 layer extractors plus recovery_checkin_extractor (landed via PR #832) cover every documented v6 stage. No new extractors are added by v6 — the v6 work is wiring + system-prompt language sweep per spec §3.4.

`extractor_id`	Source file	Tier	Caching	Failure mode
`intent_extractor`	`app/services/extractors/intent_extractor.py`	Haiku 4.5	None (per-turn LLM call; output merged into `layer_state.intent_capture`)	Returns empty delta on `LLMGatewayError`; layer completion stays at previous value (`_base.py:97-99`)
`medical_extractor`	`app/services/extractors/medical_extractor.py`	Haiku 4.5	ICD-10/SNOMED code map cached 30 days in Redis under `icd_cache:v2:dx=...\|proc=...` (line 27)	Same non-blocking pattern; ICD map degrades to empty list if cache + LLM both fail
`travel_extractor`	`app/services/extractors/travel_extractor.py`	Haiku 4.5	None at extractor level; transport-tier table loaded once at import via `scoring_config.get_config`	Returns empty delta; `compute_transport_tier()` falls back to `T1` if scores all zero
`logistics_extractor`	`app/services/extractors/logistics_extractor.py`	Haiku 4.5	None; `VISA_TABLE` is static module-level dict (line 48)	Returns empty delta; visa derivation returns `None` for unknown country pairs
`financial_extractor`	`app/services/extractors/financial_extractor.py`	Haiku 4.5	None at extractor level	Returns empty delta; `qualify_budget()` returns `incomplete` when budget falsy
`recovery_checkin`	`app/services/extractors/recovery_checkin_extractor.py`	Haiku 4.5	Escalation keyword YAML cached via `@lru_cache(maxsize=1)` (line 69)	Deterministic regex keyword pass still runs on LLM failure (line 27); only LLM-derived fields go empty
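
The recovery_checkin failure mode differs from the other five rows, so a small sketch may help. It is illustrative only: the keyword list, the `escalation_keywords` field name, and the helper names are hypothetical and are not taken from recovery_checkin_extractor.py (the real keywords live in the cached YAML). It shows the shape of the pattern, where the regex pass always runs and only the LLM-derived fields go empty when the LLM call fails.

```python
import re
from typing import Any, Optional

# Hypothetical keywords for illustration; the real list is loaded from YAML.
HYPOTHETICAL_KEYWORDS = ("chest pain", "heavy bleeding", "high fever")
_KEYWORD_RE = re.compile(
    "|".join(re.escape(k) for k in HYPOTHETICAL_KEYWORDS), re.IGNORECASE
)


def keyword_hits(message_text: str) -> list[str]:
    """Deterministic pass: sorted, de-duplicated, lower-cased keyword matches."""
    return sorted({m.group(0).lower() for m in _KEYWORD_RE.finditer(message_text)})


def build_delta(message_text: str, llm_fields: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Combine LLM-derived fields (None when the LLM call failed) with the
    regex result, which is computed either way."""
    delta: dict[str, Any] = dict(llm_fields or {})
    delta["escalation_keywords"] = keyword_hits(message_text)  # hypothetical field name
    return delta


# LLM call failed: LLM-derived fields are absent, keyword hits survive.
degraded = build_delta("Bad night, HIGH FEVER and some chest pain", None)
```

Under these assumptions the degraded delta still carries enough to drive an escalation decision, which is the property the "Failure mode" column describes.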

Input shape (uniform across all six — see _base.run_extraction, lines 49-122):

message_text: str — the latest user turn's content
existing_data: dict[str, Any] — the current per-layer slice from PatientLayerState
tenant_id: str | None + case_id: str | None — propagated for Langfuse session/user tagging

System prompt is static per extractor (hardcoded module-level constant); the v6 §3.4 sweep replaces "layer N" wording with stage-equivalent language but preserves schemas. No patient content embedded in system prompts (§3.4 CI gate tests/test_extractor_prompts_pii_safe.py).

Output schema (uniform envelope; extractor-specific delta inside):

{
  "delta": { /* extractor-specific JSON keys, matches schema in the extractor's SYSTEM_PROMPT */ },
  "completion_estimate": 0.0  // float in [0, 1]
}

The reducer (_merge_layer_deltas, triage_agent.py:323-) merges deltas into layer_state using high-water-mark on completion + deep merge on data.
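
A minimal sketch of the merge rule described above (high-water mark on completion, deep merge on data). This is not the actual `_merge_layer_deltas` code: the function names and the `{"data": ..., "completion": ...}` slice shape are assumptions made for illustration.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], delta: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict: nested dicts merge key by key, any other
    value in `delta` overwrites the value in `base`."""
    merged = deepcopy(base)
    for key, value in delta.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def merge_layer_delta(layer: dict[str, Any], envelope: dict[str, Any]) -> dict[str, Any]:
    """Merge one extractor envelope into one layer slice (assumed shape)."""
    return {
        "data": deep_merge(layer.get("data", {}), envelope.get("delta", {})),
        # High-water mark: completion never moves backwards.
        "completion": max(
            layer.get("completion", 0.0),
            envelope.get("completion_estimate", 0.0),
        ),
    }


layer = {"data": {"procedure": {"name": "knee replacement"}}, "completion": 0.6}
envelope = {"delta": {"procedure": {"laterality": "left"}}, "completion_estimate": 0.4}
merged = merge_layer_delta(layer, envelope)
```

In this sketch a lower `completion_estimate` (0.4) cannot pull the stored completion (0.6) down, and an empty delta leaves the data untouched, which is what makes the empty-delta failure mode non-destructive.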

3. Stage × Extractor matrix

Legend:

run — extractor is in this stage's extractors_active list and always invoked when its signal gate fires
skip — extractor not wired; orchestrator does not spawn it for this stage
cond(<predicate>) — extractor is wired but only invoked when the predicate is true

Column abbreviations: INT=intent, MED=medical, TRV=travel, LOG=logistics, FIN=financial, REC=recovery_checkin.

Stage	INT	MED	TRV	LOG	FIN	REC
`discovery`	run	run	skip	skip	skip	skip
`procedure_identification`	run	run	skip	skip	skip	skip
`records_collection`	run	run	skip	skip	skip	skip
`match_review`	cond(decision_stage_volatile)	run	run	run	run	skip
`consent_capture`	cond(decision_stage_volatile)	run	skip	skip	skip	skip
`mso_offer`	run	run	skip	skip	run	skip
`scheduling`	skip	run	run	run	run	skip
`pre_travel`	skip	run	run	run	run	skip
`in_treatment`	skip	run	skip	skip	skip	skip
`recovery_offer`	skip	run	run	skip	run	skip
`recovery_followup`	skip	run	skip	skip	skip	run
`support`	run	run	skip	skip	skip	skip

run cell count: 30 across 12 stages × 6 extractors (= 72 cells total; 30 run, 2 cond, 40 skip).
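
The cell counts above can be re-derived mechanically. The sketch below transcribes the matrix into a Python dict (the `MATRIX` and `EXTRACTORS` names are illustrative, not existing code) and tallies the cell types; `cond` stands for the `cond(decision_stage_volatile)` cells.

```python
from collections import Counter

EXTRACTORS = ("INT", "MED", "TRV", "LOG", "FIN", "REC")

# One row per stage, cells in EXTRACTORS order, transcribed from the matrix.
MATRIX = {
    "discovery":                ("run",  "run", "skip", "skip", "skip", "skip"),
    "procedure_identification": ("run",  "run", "skip", "skip", "skip", "skip"),
    "records_collection":       ("run",  "run", "skip", "skip", "skip", "skip"),
    "match_review":             ("cond", "run", "run",  "run",  "run",  "skip"),
    "consent_capture":          ("cond", "run", "skip", "skip", "skip", "skip"),
    "mso_offer":                ("run",  "run", "skip", "skip", "run",  "skip"),
    "scheduling":               ("skip", "run", "run",  "run",  "run",  "skip"),
    "pre_travel":               ("skip", "run", "run",  "run",  "run",  "skip"),
    "in_treatment":             ("skip", "run", "skip", "skip", "skip", "skip"),
    "recovery_offer":           ("skip", "run", "run",  "skip", "run",  "skip"),
    "recovery_followup":        ("skip", "run", "skip", "skip", "skip", "run"),
    "support":                  ("run",  "run", "skip", "skip", "skip", "skip"),
}

# Tally every cell across all rows.
counts = Counter(cell for row in MATRIX.values() for cell in row)

# Wired extractors per stage (run or cond), e.g. a candidate extractors_active list.
wired = {
    stage: [name for name, cell in zip(EXTRACTORS, row) if cell != "skip"]
    for stage, row in MATRIX.items()
}
```

A check like this could sit next to stages.yaml as a cheap guard against the doc and the config drifting apart.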

Conditional predicate — decision_stage_volatile:

intent.decision_stage in {comparing_options, just_exploring}
   OR
patient_state.layer_state.intent_capture.completion < 0.8

Used in match_review + consent_capture to re-run intent_extractor when the patient is still pivoting (per v6 spec §2.6 "backward stage moves" — match_review → procedure_identification is intentional when the patient pivots procedures). When intent is locked, skip the extractor to save a Haiku call.
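
If the predicate is implemented as a deterministic compare against layer state, it could look like the sketch below. The slice shape (`data` / `completion` keys), the location of `decision_stage` inside `data`, and the `ready_to_book` value in the usage lines are assumptions for illustration only.

```python
from typing import Any

VOLATILE_DECISION_STAGES = frozenset({"comparing_options", "just_exploring"})
INTENT_COMPLETION_THRESHOLD = 0.8


def decision_stage_volatile(intent_capture: dict[str, Any]) -> bool:
    """True when intent should be re-extracted: the patient is still
    exploring or comparing, OR intent capture is below the threshold."""
    data = intent_capture.get("data", {})
    completion = intent_capture.get("completion", 0.0)
    return (
        data.get("decision_stage") in VOLATILE_DECISION_STAGES
        or completion < INTENT_COMPLETION_THRESHOLD
    )


# "ready_to_book" is a hypothetical non-volatile value used only for the demo.
still_comparing = decision_stage_volatile(
    {"data": {"decision_stage": "comparing_options"}, "completion": 0.9}
)
locked = decision_stage_volatile(
    {"data": {"decision_stage": "ready_to_book"}, "completion": 0.9}
)
```

Note that a missing or empty intent slice evaluates as volatile here (completion defaults to 0.0), which errs on the side of running the extractor.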

3.1 Per-`run` cell: input slice + output consumer

For each run cell, the input is always the latest user message + the existing layer slice (per _base.run_extraction). The differentiator is which triage decision in the stage consumes the merged output. The mapping below is keyed by stage_id; entries that begin with "Catch" are cells where the extractor runs but is not load-bearing for the stage's primary advance predicate (signal-gated catch).

Cell	Output consumer in this stage
`discovery` × INT	`procedure_identified` flag (workflow_state); urgency / emotional readiness for guidance tone
`discovery` × MED	First-pass procedure name → flips `procedure_identified` → advances to `procedure_identification`
`procedure_identification` × INT	Refines `decision_stage` (just_exploring → comparing_options) — feeds `match_review` gate
`procedure_identification` × MED	Confirms procedure + body_site/laterality + comorbidities → unblocks `records_collection`
`records_collection` × INT	Catch — signal-gated; runs if patient mentions urgency or new fear while uploading
`records_collection` × MED	Symptoms, medications, allergies, age — populates FHIR observations + advances `medical_status.completion` ≥ 0.7
`match_review` × INT (cond)	Detects pivot (`decision_stage` → `comparing_options`) → triggers backward stage move per §2.6
`match_review` × MED	New comorbidity / medication captured mid-review → re-scores match weights
`match_review` × TRV	First stage at which mobility / oxygen requirements surface; transport tier feeds match relevance
`match_review` × LOG	Country preferences + timeline → narrows shortlist
`match_review` × FIN	Budget band + insurance preauth → match qualification (`qualified` / `stretch` / `mismatched`)
`consent_capture` × INT (cond)	Detects last-minute hesitation → guidance softens, never re-asks consent until intent stabilises
`consent_capture` × MED	Catch — captures any new comorbidity revealed during consent reading; FHIR addendum
`mso_offer` × INT	Reads `primary_fear` → MSO offer framing (fear of complications → emphasise second opinion)
`mso_offer` × MED	Catch — symptoms surfaced during offer can re-rank consult urgency
`mso_offer` × FIN	Insurance coverage of MSO consult → affects offer wording (covered vs out-of-pocket)
`scheduling` × MED	Time-zone / fitness affecting consult timing
`scheduling` × TRV	Mobility → in-person vs video consult preference
`scheduling` × LOG	Timezone (country_of_residence) → consult slot proposal
`scheduling` × FIN	Catch — payment method surfaced during scheduling triggers payment-link card
`pre_travel` × MED	Recent medication changes affecting travel fitness
`pre_travel` × TRV	Transport tier (T1-T4) → travel-readiness checklist
`pre_travel` × LOG	Visa derivation + companion + timeline → pre-travel readiness card
`pre_travel` × FIN	Catch — final cost confirmation before flights
`in_treatment` × MED	Light-touch: medication / symptom updates flowing to coordinator
`recovery_offer` × MED	Procedure category determines recovery-facility eligibility
`recovery_offer` × TRV	Current mobility → recovery facility tier match
`recovery_offer` × FIN	Recovery facility cost vs budget → offer framing
`recovery_followup` × MED	Medication adherence, symptom check via free text
`recovery_followup` × REC	Primary: pain_level + escalation keywords → Telegram alert + coordinator escalation (ADR-0018 §K)
`support` × INT	Captures any signal the patient drops while resolver is stuck — kept warm so recovery from `support` is fast
`support` × MED	Same — opportunistic capture; never advances stage from `support` alone

The cells marked catch run via the existing signal-gate mechanism (detect_layer_signals, triage_agent.py:297-) — already production-shipped behind flag extractor_signal_gating_enabled. v6 re-uses this gate verbatim. Phase 2b only declares extractors_active in stages.yaml; the gate handles the conditional skip.

4. Knowledge addendum gating

Per v6 spec §2.3 + §2.5 hard constraint: at most 1 knowledge addendum attaches per turn (Anthropic 4-cache-breakpoint limit). Addendums are triggered by patient context, not by stage — but stage interacts with the trigger predicate. Cross-reference with the sibling truth-table doc:

Addendum	Trigger predicate (from §2.3)	Stages where this addendum is most likely to attach
`procedure_clinical_facts/{procedure_slug}.yaml`	`patient.procedure` identified	`procedure_identification`, `records_collection`, `match_review`, `mso_offer`, `consent_capture`
`financial_options.yaml`	Patient asked about cost OR budget tier undetermined	`match_review`, `mso_offer`, `scheduling`, `pre_travel`, `recovery_offer`
`post_travel_logistics.yaml`	`stage_active in {pre_travel, in_treatment}` AND `patient.passport_status != confirmed`	`pre_travel`, `in_treatment` (predicate is stage-scoped)
`insurance_handling.yaml`	Patient mentioned insurance OR `funding_source == insurance`	`match_review`, `consent_capture`, `mso_offer`, `scheduling`, `recovery_offer`
`mso_second_opinion.yaml` (migrated from v4 line 204)	`mso_patient_offer_enabled` flag true AND stage in `{mso_offer, scheduling}`	`mso_offer`, `scheduling`

Priority ordering (per v6 spec §2.3 — clinical-safety addendums above commercial):

procedure_clinical_facts (clinical) — highest priority
post_travel_logistics (clinical-adjacent — fitness-to-travel)
insurance_handling (commercial)
financial_options (commercial)
mso_second_opinion (commercial) — lowest priority

When multiple match, the highest-priority addendum wins; the rest are dropped that turn (§2.3 cap of 1).
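
The cap-of-1 rule reduces to "first match in priority order". A minimal sketch follows; `ADDENDUM_PRIORITY` and `pick_addendum` are illustrative names, and the real predicate evaluation is assumed to happen elsewhere and hand in the set of matched addendum keys.

```python
from typing import Optional

# Highest priority first, mirroring the ordering listed above.
ADDENDUM_PRIORITY = (
    "procedure_clinical_facts",
    "post_travel_logistics",
    "insurance_handling",
    "financial_options",
    "mso_second_opinion",
)


def pick_addendum(matched: set[str]) -> Optional[str]:
    """Return the single addendum to attach this turn, or None.

    `matched` holds the addendums whose trigger predicates are true;
    everything except the highest-priority match is dropped."""
    for name in ADDENDUM_PRIORITY:
        if name in matched:
            return name
    return None


winner = pick_addendum({"financial_options", "procedure_clinical_facts"})
```

Keeping the ordering in one tuple means a priority change is a one-line edit rather than a change to branching logic.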

Stage-by-stage default addendum (the one that fires most often, given typical patient state at that stage):

Stage	Default addendum	Notes
`discovery`	none	Procedure not yet identified; no triggers met
`procedure_identification`	`procedure_clinical_facts/{slug}.yaml` (once procedure surfaces)	Attaches as soon as medical_extractor pins `procedure.name`
`records_collection`	`procedure_clinical_facts/{slug}.yaml`	Stable
`match_review`	`financial_options.yaml` OR `procedure_clinical_facts/{slug}`	Tie broken by priority — clinical wins
`consent_capture`	`insurance_handling.yaml` (if insurance) else `procedure_clinical_facts/{slug}`	Stable
`mso_offer`	`mso_second_opinion.yaml` (gated by `mso_patient_offer_enabled`)	Falls back to `procedure_clinical_facts` if MSO offer off
`scheduling`	`financial_options.yaml`
`pre_travel`	`post_travel_logistics.yaml`	Predicate already stage-scoped
`in_treatment`	`post_travel_logistics.yaml` (residual logistics issues)	Light-touch overall
`recovery_offer`	`financial_options.yaml`	Cost is the chief recovery-offer concern
`recovery_followup`	none	Recovery turn is intentionally narrow — extractor + base only
`support`	none	Per §2.5 — `support` collapses to base + patient_context only

The full predicate gating belongs in app/agents/knowledge_addendum_resolver.py (new in Phase 2b); this table is the cheat-sheet for the resolver.

5. Parallel execution graph

Per triage_agent.py:953-957 (asyncio.gather(...) on the extractor task list), today's pipeline runs all signaled extractors in parallel within a single map-reduce node. v6 preserves this; the only delta is which extractors enter the gather group.

Hard data dependency (one rule today, one rule v6 may add):

medical_extractor reads intent classification implicitly (procedure framing depends on whether decision_stage indicates the patient is asserting vs exploring). Today this is resolved by the LLM reading existing_data (the prior layer_state), so the order is "previous turn's intent informs this turn's medical." There is no in-turn dependency that would prevent parallel execution, because the LLM reads the existing_data snapshot rather than another extractor's in-turn delta.
recovery_checkin is independent of all others — its escalation logic (regex keyword scan) is deterministic and does not consume other extractors' deltas. It can run in parallel with medical_extractor in recovery_followup.

Per-stage parallel groups (everything within a row runs concurrently via asyncio.gather):

Stage	Parallel group	Critical path
`discovery`	{INT, MED}	max(haiku) over 2 concurrent calls
`procedure_identification`	{INT, MED}	max(haiku) — both Haiku
`records_collection`	{INT, MED}	max(haiku) + ICD cache lookup
`match_review`	{INT(cond), MED, TRV, LOG, FIN}	max(haiku) — 5-wide gather
`consent_capture`	{INT(cond), MED}	max(haiku)
`mso_offer`	{INT, MED, FIN}	max(haiku)
`scheduling`	{MED, TRV, LOG, FIN}	max(haiku) — 4-wide
`pre_travel`	{MED, TRV, LOG, FIN}	max(haiku) — 4-wide
`in_treatment`	{MED}	haiku
`recovery_offer`	{MED, TRV, FIN}	max(haiku)
`recovery_followup`	{MED, REC}	max(haiku, regex_keyword_scan)
`support`	{INT, MED}	max(haiku)

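Per-Haiku-call latency budget: ~600-900 ms p50, ~1500 ms p95 (per current Langfuse data, model_registry routes all to claude-haiku-4.5). The 5-wide match_review gather is the worst case: all five Haiku calls run concurrently, so the critical path is the slowest single call rather than the sum of five. That keeps the gather in the neighbourhood of the single-call ~1500 ms p95, although the p95 of the slowest of five calls sits at or above the single-call p95.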
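
To put a rough number on how the gather width affects tail latency, the sketch below uses the standard order-statistics identity P(max ≤ x) = F(x)^n. It assumes the concurrent calls are independent and identically distributed, which a shared gateway or rate limiter may well violate, so treat the result as a back-of-the-envelope guide rather than a measurement. The function name is illustrative.

```python
def single_call_quantile(group_quantile: float, width: int) -> float:
    """Quantile of ONE call's latency distribution that corresponds to
    `group_quantile` of the slowest call among `width` independent,
    identically distributed calls (P(max <= x) = F(x) ** width)."""
    if width < 1:
        raise ValueError("width must be >= 1")
    if not 0.0 < group_quantile < 1.0:
        raise ValueError("group_quantile must be in (0, 1)")
    return group_quantile ** (1.0 / width)


# p95 of a 5-wide gather corresponds to roughly the single-call p99.
q_match_review = single_call_quantile(0.95, 5)   # about 0.9898
q_in_treatment = single_call_quantile(0.95, 1)   # 0.95, single call
```

Under these assumptions, a dashboard for the match_review gather would be better compared against the single-call p99 than the single-call p95.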

Future: if a later extractor adds an explicit consumer of another's delta (e.g., a hypothetical risk_extractor that reads medical.comorbidities), the gather group must split into a two-phase DAG. Out of scope for v6.

6. Token budget per stage

Per extractor call (averaged across the 6 extractors at Haiku tier):

- Input: ~1,200 tokens (system_prompt ~800 + user_content with existing_data + message ~400)
- Output: ~400 tokens (JSON delta + filler)
- Per-call cost: ~$0.0002 (Haiku 4.5 published rates, see docs/reference/llm-evaluation.md)

Per-stage extractor cost = (count of run cells in row) × ~$0.0002.

Stage	Extractor calls (steady state)	Extractor input tokens	Extractor output tokens	Cost (¢, Haiku)
`discovery`	2	2,400	800	~0.04
`procedure_identification`	2	2,400	800	~0.04
`records_collection`	2	2,400	800	~0.04
`match_review`	4-5 (cond gate on INT)	4,800-6,000	1,600-2,000	~0.08-0.10
`consent_capture`	1-2 (cond gate on INT)	1,200-2,400	400-800	~0.02-0.04
`mso_offer`	3	3,600	1,200	~0.06
`scheduling`	4	4,800	1,600	~0.08
`pre_travel`	4	4,800	1,600	~0.08
`in_treatment`	1	1,200	400	~0.02
`recovery_offer`	3	3,600	1,200	~0.06
`recovery_followup`	2	2,400	800	~0.04
`support`	2	2,400	800	~0.04

Per-turn extractor token budget (worst case, match_review): ~6,000 in + 2,000 out (extractor pipeline only).

Total per-turn cost (extractor pipeline + main conversation LLM call) at worst-case match_review:

- Extractor pipeline: ~$0.0010 (5 Haiku calls)
- Main conversation call (per §2.5): ~4,400 input tok + 600 output tok at Haiku = ~$0.0008
- Combined: ~$0.0018 / turn at the most expensive stage
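
The table and the worst-case total follow from the per-call estimates by simple multiplication. The sketch below reproduces that arithmetic, taking the document's own figures (~$0.0002 per extractor call, ~1,200 input / ~400 output tokens per call, ~$0.0008 for the main call) as given inputs rather than re-deriving them from published rates. The constant and function names are illustrative.

```python
PER_CALL_USD = 0.0002    # per-extractor-call estimate from this section
MAIN_CALL_USD = 0.0008   # main conversation call estimate from this section
INPUT_TOKENS_PER_CALL = 1_200
OUTPUT_TOKENS_PER_CALL = 400

# (min_calls, max_calls) per stage; the ranges come from the cond gate on INT.
CALLS = {
    "discovery": (2, 2), "procedure_identification": (2, 2),
    "records_collection": (2, 2), "match_review": (4, 5),
    "consent_capture": (1, 2), "mso_offer": (3, 3),
    "scheduling": (4, 4), "pre_travel": (4, 4),
    "in_treatment": (1, 1), "recovery_offer": (3, 3),
    "recovery_followup": (2, 2), "support": (2, 2),
}


def stage_cost_cents(stage: str) -> tuple[float, float]:
    """(low, high) extractor cost for one turn in `stage`, in US cents."""
    low, high = CALLS[stage]
    return (round(low * PER_CALL_USD * 100, 4), round(high * PER_CALL_USD * 100, 4))


# Most expensive stage by maximum call count, and its combined per-turn cost.
worst_stage = max(CALLS, key=lambda s: CALLS[s][1])
worst_calls = CALLS[worst_stage][1]
worst_turn_usd = round(worst_calls * PER_CALL_USD + MAIN_CALL_USD, 4)
worst_tokens = (worst_calls * INPUT_TOKENS_PER_CALL, worst_calls * OUTPUT_TOKENS_PER_CALL)
```

If the per-call estimate is revised, only the two constants at the top need to change for the whole table to be regenerated.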

Steady-state per-turn budget across all stages (weighted by stage occupancy): ~$0.0010 / turn (matches v6 spec §Appendix C cost projection within ~10%).

Observability: per spec §3.10, every extractor call already stamps agent_name on the Langfuse trace. Phase 2b adds the stage:{stage_name} trace tag (per §3.1 + §3.10) so the Metabase "cost per stage" query is a group-by on the existing trace data — no new pipeline needed.

7. Open questions for Phase 2b implementation

These need code-level investigation when the Phase 2b implementer (Sonnet) picks this up.

medical_extractor output schema is the largest in the system (lines 60-90 of medical_extractor.py: nested patient_demographics, procedure, diagnosis blocks) and is not formally documented as a Pydantic model. Phase 2b should formalise it as app/schemas/extractor_outputs/medical.py and pin every consumer to it; the matrix's "output consumer" column is brittle until then.
intent_extractor conditional gate predicate — the decision_stage_volatile predicate in §3 is currently nominal. Phase 2b must decide: is this an LLM-routed check (free text → re-run extractor) or a deterministic compare against layer_state.intent_capture? Recommendation: deterministic, since the field is already populated.
Where is the extractors_active: [list] field consumed? v6 spec §3.4 mitigation says "orchestrator reads this." Today, triage_agent.py:902 hardcodes "active layer always runs + signal gate for others." Phase 2b needs to thread stages.yaml.extractors_active through run_extractors_node — the natural location is between state.get("active_layer") and the task-builder loop (lines 928-947).
recovery_checkin is the only extractor whose failure path does not rely on Haiku alone (the regex keyword scan continues on LLM failure, per line 27). Phase 2b should decide: do other extractors get equivalent deterministic fallbacks, or is the empty-delta-on-fail pattern sufficient? Spec §3.4 says "deterministic fallback per agent fallback discipline" but is silent on what that means for extractors that have no obvious deterministic equivalent.
match_review 5-wide gather is the p95 latency hotspot. Phase 2b should add a Langfuse dashboard panel for "match_review extractor pipeline duration" before ramping v6 traffic — the spec §2.5 acceptance criterion is segment cache hit rate, but match_review's bottleneck is extractor parallel-gather latency, not cache.
support stage runs INT + MED (per matrix). Per v6 spec §2.5, support "collapses to base + patient_context only" for the main conversation LLM call. Does that exclusion apply to extractors too, or only to the conversation turn? Recommendation: keep extractors warm in support (cheap insurance against losing signals while resolver is stuck), but Phase 2b must confirm.
stage_indicator debug card (per v6 spec §3.7) should expose the resolved extractors_active list when the BE-side flag is on. Adds a small admin-UX nicety for triaging Phase 2b wiring bugs.
mso_second_opinion.yaml migration from v4 (per §3.5.1 governance item 9) — when Phase 2b lands the addendum file, ensure the v5 path keeps the v4-shaped offer text intact. Regression test: v5 + mso_patient_offer_enabled=true produces identical offer wording before/after the migration.

8. Appendix — keys cross-reference

stage_id keys MUST be byte-identical between this doc, the truth-table doc, stages.yaml, and app/services/stage_resolver.py. extractor_id keys MUST match the agent_name passed to llm_gateway.invoke() (which today are intent_extractor.extract, medical_extractor.extract, etc. — note the .extract suffix). The matrix above uses the short form (intent_extractor) for readability; the agent_name suffix is preserved per Langfuse trace-tag continuity (no Langfuse query needs to change).
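
Both rules in this appendix are easy to check mechanically. The sketch below is one possible shape for such a check; the helper names are hypothetical, and loading the key sets from the actual docs, stages.yaml, and stage_resolver.py is left out.

```python
AGENT_NAME_SUFFIX = ".extract"


def agent_name_for(extractor_id: str) -> str:
    """Short-form extractor_id -> agent_name as passed to the gateway."""
    return extractor_id + AGENT_NAME_SUFFIX


def extractor_id_from(agent_name: str) -> str:
    """agent_name -> short-form extractor_id; rejects names without the suffix."""
    if not agent_name.endswith(AGENT_NAME_SUFFIX):
        raise ValueError(f"agent_name lacks {AGENT_NAME_SUFFIX!r}: {agent_name!r}")
    return agent_name[: -len(AGENT_NAME_SUFFIX)]


def key_mismatches(*key_sets: set[str]) -> set[str]:
    """Keys missing from at least one source. Python string equality is
    exact, so a stray space or case change shows up as a mismatch."""
    if not key_sets:
        return set()
    union = set().union(*key_sets)
    common = set(key_sets[0]).intersection(*key_sets[1:])
    return union - common


doc_keys = {"discovery", "match_review", "support"}
yaml_keys = {"discovery", "match_review", "Support"}   # deliberate typo for the demo
drift = key_mismatches(doc_keys, yaml_keys)
```

Run as a CI test, a non-empty result from `key_mismatches` would flag drift between the four sources before it reaches the resolver.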