04 — Agent Pipeline Design¶
The agentic layer is the core differentiator. Without it: CRUD app with scoring formula. With it: AI reads medical reports, understands diagnoses, explains treatment options in the patient's language.
Framework Stack¶
| Framework | Role | When |
|---|---|---|
| LangGraph | Agent orchestration — multi-agent StateGraph with typed state | Every patient request |
| LangChain | Tool wrappers for LLM/DB/API calls | Called by LangGraph nodes |
| Langfuse | Production observability — traces, cost, latency + prompt management | Every LLM call |
Agent Orchestrator¶
Endpoint: POST /api/v1/patients/{id}/chat
Single unified interface the frontend calls. Top-level LangGraph StateGraph that classifies intent and routes.
| Patient Intent | Routed To | Example |
|---|---|---|
| Medical report submission | Clinical Context Agent | "I have a knee X-ray report" |
| Treatment search | Match Agent | "Find me the best hospital" |
| Question about results | Explanation Agent | "Why was Apollo recommended?" |
| Getting started / general | Intake Agent | "I need help getting started" |
| Document upload notification | Document handler | (attachment metadata in request) |
Orchestrator State Schema¶
class OrchestratorState(TypedDict):
patient_id: str
tenant_id: str
message: str
conversation_history: list[dict]
intent: str # classified intent
attachments: list[dict] # uploaded document references
case_status: str # current workflow phase
workflow_state: dict # phase-specific state
response: str # agent response to return
suggested_actions: list[dict] # UI action chips
agent_name: str # which agent handled this
Agent Specifications¶
Clinical Context Agent¶
Purpose: Raw medical report text → validated FHIR R4 resources with ICD-10/SNOMED coding.
LangGraph Nodes:
1. extract_clinical_entities — Claude Haiku parses raw text → structured conditions, procedures, medications, allergies
2. map_to_medical_codes — Claude Haiku maps each entity to ICD-10/SNOMED with confidence scores
3. generate_fhir_resources — Generates valid FHIR R4 JSON (Condition, Procedure, AllergyIntolerance, Observation)
4. store_resources — Calls fhir_service.create_fhir_resource() with full R4 schema validation
State: { patient_id, tenant_id, raw_text, report_type, extracted_entities[], coded_entities[], fhir_resources[], stored_resource_ids[], errors[] }
Model: Claude Haiku 4.5 (~$0.01/report). Fallback: GPT-4o mini.
Prompt: System prompt with 2–3 few-shot examples of real radiology reports mapped to ICD codes. Output schema enforced via structured JSON. Managed in Langfuse.
Fallback: 202 Accepted, raw text stored, extraction queued for QStash retry. Patient never blocked.
Intake Agent¶
Purpose: Conversational patient onboarding. Records-first: extracts from documents before asking questions.
LangGraph Nodes:
1. classify_intent — Routes patient message to appropriate handler
2. collect_information — Extracts structured data from conversational input
3. suggest_actions — Recommends next steps (upload X-ray, provide insurance, grant consent)
4. update_progress — Advances intake status, updates patient profile
State: { patient_id, tenant_id, message, conversation_history[], intent, extracted_data{}, suggested_actions[], intake_progress: float }
Model: Claude Haiku 4.5. State in events table (not in-memory). Fallback: Standard form-based intake.
Match Agent¶
Purpose: Wraps deterministic matching engine with AI pre/post-processing.
LangGraph Nodes:
1. analyze_clinical_picture — Claude Sonnet reviews all FHIR resources, generates clinical summary with risk factors
2. determine_requirements — Identifies needed specialties, procedures, accommodations
3. run_weighted_scoring — Calls existing WeightedScoringV1 (or strategy from Flagsmith)
4. rerank_edge_cases — Claude reviews top 5 for comorbidity risks, contraindications
5. generate_explanations — Passes to Explanation Agent
Feature flag: agent_enhanced_matching. When disabled, steps 1,2,4,5 skipped. Zero regression.
Models: Claude Sonnet 4.6 for clinical analysis. Claude Haiku 4.5 for re-ranking.
Explanation Agent¶
Purpose: Natural language match reasoning in patient's preferred locale.
Model: Claude Haiku 4.5. Supports multilingual output via patient.preferred_locale.
Fallback: Template-based string explanations (pre-agent behavior).
Example output for Aisha (Arabic locale):
Based on your knee osteoarthritis diagnosis (ICD M17.11), Apollo Hospitals Chennai is the strongest match. Their orthopedic department has performed over 3,000 knee replacements with a 95% success rate. The hospital supports Arabic-speaking staff and halal dietary options. At approximately $6,000–$8,000 USD, they offer competitive pricing in the India corridor.
Three-Layer Guardrails¶
- Langfuse-managed system prompts — externalized, versioned, A/B testable. Define agent boundaries (DO / DON'T / REDIRECT framework).
- GPT-4o-mini input classifier — lightweight pre-filter classifying messages for safety, relevance, routing before reaching primary agent.
- Regex output validation — post-processing catches PII leakage, hallucinated medical advice, format violations.
All rules externalized to config/guardrails.yaml.
Guardrail Categories¶
guardrails:
blocked_intents:
- medical_advice # "Should I take ibuprofen?"
- diagnosis_speculation # "That sounds like arthritis"
- off_topic # "What's the weather in Bangkok?"
- outcome_prediction # "You'll probably be fine"
redirect_intents:
- emergency # → "Please call your local emergency services"
- existing_treatment # → "Please consult your current physician"
allowed_intents:
- intake_information
- document_upload
- match_query
- explanation_request
- preference_update
- procedure_question # Factual: "What is TKR recovery time?"
Fallback Philosophy¶
Every agent has a deterministic fallback. Platform never broken by LLM failure.
| Agent | Fallback |
|---|---|
| Clinical Context | Store raw text, queue QStash retry. Patient not blocked. |
| Intake | Standard form-based intake (REST endpoints). |
| Match | Pure WeightedScoringV1 without AI enrichment. |
| Explanation | Template-based string explanations. |
| Orchestrator | Direct API calls to individual endpoints. |
Prompt Strategy¶
- All system prompts in
app/agents/prompts/as versioned Python modules - Every prompt includes 2–3 few-shot examples of real medical data
- Output format enforced via JSON schema instructions
- Prompt versions managed via Langfuse prompt management
- Regression testing deferred to post-MVP (needs eval datasets)
Agent Observability¶
| Layer | Tool | Tracks |
|---|---|---|
| Request-level | Events table | agent_name, model, tokens, latency, cost, success/failure, correlation_id |
| Trace-level | Langfuse | Full traces, nested spans, prompt/completion pairs, cost per journey |
| Eval-level | LangSmith (post-MVP) | Offline evals when enough data exists |