# Agent System

## Overview
Curaway's AI agent system is the brain of the platform. It orchestrates multi-step workflows that combine clinical understanding, patient interaction, provider matching, and natural-language explanation. The system is built on three pillars:
- LangGraph -- Orchestration framework for multi-node, stateful AI workflows
- LangChain -- Tool wrappers that give agents access to databases, APIs, and external services
- Langfuse -- Observability platform for tracing, prompt management, and cost tracking
> **Healthcare Safety Principle**
>
> Every agent has a deterministic fallback path. If an LLM call fails, times out, or returns invalid output, the system falls back to rule-based logic. Healthcare workflows cannot be broken by LLM failure.
## Architecture

### Single Entry Point
All patient interactions flow through one API endpoint:
```python
@router.post("/api/v1/cases/{case_id}/chat")
async def chat(
    case_id: UUID,
    request: ChatRequest,
    tenant_id: str = Header(alias="X-Tenant-ID"),
):
    """Single entry point for all case-related conversations."""
    orchestrator = CaseOrchestrator(case_id=case_id, tenant_id=tenant_id)
    response = await orchestrator.process(request.message)
    return ChatResponse(message=response.content, phase=response.phase)
```
### Orchestrator
The orchestrator manages 8 workflow phases, routing messages to the appropriate sub-agent based on the current case state:
```mermaid
graph TD
    Entry[POST /chat] --> Orch[Orchestrator]
    Orch --> Phase1[1. Initial Contact]
    Orch --> Phase2[2. Document Upload]
    Orch --> Phase3[3. Clinical Review]
    Orch --> Phase4[4. Intake Collection]
    Orch --> Phase5[5. Matching]
    Orch --> Phase6[6. Results Presentation]
    Orch --> Phase7[7. Consultation Scheduling]
    Orch --> Phase8[8. Follow-up]
    Phase2 --> CCA[Clinical Context Agent]
    Phase4 --> IA[Intake Agent]
    Phase5 --> MA[Match Agent]
    Phase6 --> EA[Explanation Agent]
    style Orch fill:#008B8B,color:#fff
    style CCA fill:#FF7F50,color:#fff
    style IA fill:#FF7F50,color:#fff
    style MA fill:#FF7F50,color:#fff
    style EA fill:#FF7F50,color:#fff
```
| Phase | Agent | Description |
|---|---|---|
| 1. Initial Contact | Orchestrator (direct) | Greeting, case creation, basic info |
| 2. Document Upload | Clinical Context Agent | Process uploaded medical documents |
| 3. Clinical Review | Clinical Context Agent | Extract entities, generate FHIR resources |
| 4. Intake Collection | Intake Agent | Gather preferences, constraints, requirements |
| 5. Matching | Match Agent | Run provider matching algorithm |
| 6. Results Presentation | Explanation Agent | Present matches with explanations |
| 7. Consultation Scheduling | Orchestrator (direct) | Book video consults with matched doctors |
| 8. Follow-up | Orchestrator (direct) | Post-match communication, next steps |
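The phase-to-agent routing in the table above can be sketched as a simple dispatch table. This is an illustrative sketch, not the actual orchestrator code; the enum and agent names are assumptions:

```python
from enum import IntEnum


class Phase(IntEnum):
    INITIAL_CONTACT = 1
    DOCUMENT_UPLOAD = 2
    CLINICAL_REVIEW = 3
    INTAKE_COLLECTION = 4
    MATCHING = 5
    RESULTS_PRESENTATION = 6
    CONSULTATION_SCHEDULING = 7
    FOLLOW_UP = 8


# Phases handled by a dedicated sub-agent; all other phases are
# handled by the orchestrator directly.
SUB_AGENT_FOR_PHASE = {
    Phase.DOCUMENT_UPLOAD: "clinical_context_agent",
    Phase.CLINICAL_REVIEW: "clinical_context_agent",
    Phase.INTAKE_COLLECTION: "intake_agent",
    Phase.MATCHING: "match_agent",
    Phase.RESULTS_PRESENTATION: "explanation_agent",
}


def route(phase: Phase) -> str:
    """Return the agent responsible for the current case phase."""
    return SUB_AGENT_FOR_PHASE.get(phase, "orchestrator")
```

The dict-based dispatch keeps routing deterministic: the LLM never decides which agent runs; only the stored case state does.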
> **First-Message Attachments**
>
> When a user sends their first message with a file attachment (e.g. "I need a knee replacement" + blood work PDF), the orchestrator identifies the procedure and processes the attachments in a single turn; it does not ask for records that were just uploaded. The procedure confirmation response is combined with the document analysis.
## The Four Agents

### 1. Clinical Context Agent
The Clinical Context Agent processes medical documents and extracts structured clinical data. It is the most complex agent, implemented as a 4-node LangGraph workflow.
Purpose: Transform unstructured medical documents into structured FHIR R4 resources.
Model: Claude Sonnet 4.6 (requires high-accuracy clinical reasoning)
```mermaid
graph LR
    A[extract_clinical_entities] --> B[map_to_medical_codes]
    B --> C[generate_fhir_resources]
    C --> D[store_resources]
    style A fill:#008B8B,color:#fff
    style B fill:#008B8B,color:#fff
    style C fill:#008B8B,color:#fff
    style D fill:#008B8B,color:#fff
```
Node Details:
| Node | Input | Output | Fallback |
|---|---|---|---|
| `extract_clinical_entities` | Raw OCR text | Structured entities (conditions, labs, meds) | Regex-based extraction patterns |
| `map_to_medical_codes` | Extracted entities | ICD-10, CPT, LOINC codes | Lookup table mapping |
| `generate_fhir_resources` | Coded entities | FHIR R4 JSON resources | Template-based FHIR generation |
| `store_resources` | Validated FHIR | Database confirmation | Direct SQL insert |
State Schema:
```python
class ClinicalContextState(TypedDict):
    """State passed between Clinical Context Agent nodes."""

    document_id: str
    tenant_id: str
    patient_id: str
    case_id: str
    raw_text: str
    extracted_entities: dict      # From node 1
    medical_codes: dict           # From node 2
    fhir_resources: list[dict]    # From node 3
    store_confirmation: dict      # From node 4
    errors: list[str]
    fallback_used: bool
```
> **Comorbidity Detection**
>
> Comorbidity detection is rule-based, not LLM-based. The system maintains a lookup table of common comorbidity pairs (e.g., diabetes + hypertension, obesity + sleep apnea) and flags them deterministically. This costs $0 per case.
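A minimal sketch of what that rule-based check might look like. The pair table here is a small illustrative subset, not the production lookup table:

```python
# Illustrative comorbidity pairs, keyed by ICD-10 code sets
# (E11 = type 2 diabetes, I10 = hypertension,
#  E66 = obesity, G47.33 = obstructive sleep apnea).
COMORBIDITY_PAIRS = {
    frozenset({"E11", "I10"}): "diabetes + hypertension",
    frozenset({"E66", "G47.33"}): "obesity + sleep apnea",
}


def detect_comorbidities(icd10_codes: list[str]) -> list[str]:
    """Flag known comorbidity pairs among a patient's ICD-10 codes."""
    present = set(icd10_codes)
    return [
        label
        for pair, label in COMORBIDITY_PAIRS.items()
        if pair <= present  # both codes of the pair are present
    ]
```

Because this is a pure set intersection over codes already produced by `map_to_medical_codes`, it needs no LLM call and is fully reproducible.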
### 2. Intake Agent
The Intake Agent conducts conversational intake to gather patient preferences, travel constraints, and treatment requirements.
Purpose: Collect structured preferences through natural conversation.
Model: Claude Haiku 4.5 (conversational, low-cost)
```mermaid
graph LR
    A[classify_message] --> B[collect_preferences]
    B --> C[suggest_options]
    C --> D[update_case]
    style A fill:#008B8B,color:#fff
    style B fill:#008B8B,color:#fff
    style C fill:#008B8B,color:#fff
    style D fill:#008B8B,color:#fff
```
Node Details:
| Node | Input | Output | Fallback |
|---|---|---|---|
| `classify_message` | Patient message | Intent classification | Keyword matching |
| `collect_preferences` | Classified intent | Structured preference data | Form-based collection |
| `suggest_options` | Current preferences | Contextual suggestions | Static suggestion list |
| `update_case` | Confirmed preferences | Updated case record | Direct DB update |
Collected Preferences:
```python
class PatientPreferences(BaseModel):
    """Preferences collected by the Intake Agent."""

    budget_range_usd: Optional[tuple[int, int]]
    preferred_countries: list[str]  # ISO 3166-1 alpha-3
    excluded_countries: list[str]
    preferred_languages: list[str]
    travel_date_range: Optional[tuple[date, date]]
    companion_count: int = 0
    dietary_restrictions: list[str]
    accessibility_needs: list[str]
    insurance_provider: Optional[str]
    previous_medical_travel: bool = False
    priority: str = "balanced"  # "cost", "quality", "speed", "balanced"
```
State Schema:
```python
class IntakeState(TypedDict):
    """State passed between Intake Agent nodes."""

    case_id: str
    tenant_id: str
    patient_id: str
    message: str
    intent: str                 # From node 1
    current_preferences: dict   # Existing preferences
    new_preferences: dict       # From node 2
    suggestions: list[str]      # From node 3
    update_confirmation: dict   # From node 4
    conversation_history: list[dict]
    missing_fields: list[str]
    errors: list[str]
```
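The `missing_fields` entry can be derived deterministically by diffing collected preferences against a required set. A sketch, assuming a hypothetical list of required fields:

```python
# Fields assumed required before matching can run; the real required
# set would live in configuration, not in code.
REQUIRED_PREFERENCE_FIELDS = [
    "budget_range_usd",
    "preferred_countries",
    "preferred_languages",
    "travel_date_range",
    "priority",
]


def compute_missing_fields(current_preferences: dict) -> list[str]:
    """Return required preference fields that are still unset or empty."""
    return [
        field
        for field in REQUIRED_PREFERENCE_FIELDS
        if not current_preferences.get(field)
    ]
```

The Intake Agent can then steer conversation toward whatever `compute_missing_fields` returns, while the form-based fallback renders the same list as input fields.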
### 3. Match Agent
The Match Agent orchestrates the provider matching workflow, combining graph traversal, semantic search, and weighted scoring.
Purpose: Find and rank the best providers and doctors for a patient's case.
Model: Claude Haiku 4.5 (orchestration) + deterministic scoring
```mermaid
graph LR
    A[analyze_requirements] --> B[gather_requirements]
    B --> C[execute_scoring]
    C --> D[rerank_and_explain]
    style A fill:#FF7F50,color:#fff
    style B fill:#FF7F50,color:#fff
    style C fill:#FF7F50,color:#fff
    style D fill:#FF7F50,color:#fff
```
Node Details:
| Node | Input | Output | Fallback |
|---|---|---|---|
| `analyze_requirements` | Case data, FHIR resources | Structured matching criteria | Rule-based criteria extraction |
| `gather_requirements` | Matching criteria | Provider candidates from Neo4j + Qdrant | Direct Neo4j query |
| `execute_scoring` | Candidates + criteria | Scored and ranked results | Weighted rules scoring |
| `rerank_and_explain` | Scored results | Final ranking with explanations | Template-based explanations |
State Schema:
```python
class MatchState(TypedDict):
    """State passed between Match Agent nodes."""

    case_id: str
    tenant_id: str
    patient_id: str
    clinical_data: dict         # FHIR resources
    patient_preferences: dict
    matching_criteria: dict     # From node 1
    candidates: list[dict]      # From node 2
    scored_results: list[dict]  # From node 3
    final_results: list[dict]   # From node 4
    strategy_used: str
    errors: list[str]
    fallback_used: bool
```
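The weighted-rules fallback for `execute_scoring` can be sketched as a plain weighted sum over scoring dimensions. The dimension names and weights below are assumptions for illustration:

```python
# Illustrative dimension weights; the real weights live in configuration.
WEIGHTS = {"clinical": 0.40, "cost": 0.25, "logistics": 0.20, "quality": 0.15}


def score_candidate(dimension_scores: dict[str, float]) -> float:
    """Weighted sum over scoring dimensions (each assumed in [0, 1])."""
    return sum(
        WEIGHTS[dim] * dimension_scores.get(dim, 0.0) for dim in WEIGHTS
    )


def rank_candidates(candidates: list[dict]) -> list[dict]:
    """Deterministic fallback: rank by weighted rule score, best first."""
    return sorted(
        candidates,
        key=lambda c: score_candidate(c["dimension_scores"]),
        reverse=True,
    )
```

Because the fallback is a pure function of the candidate data, the same inputs always yield the same ranking, which is what makes the "~90% of LLM accuracy" claim measurable in the first place.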
### 4. Explanation Agent
The Explanation Agent generates natural-language explanations of matching results, tailored to the patient's locale and language.
Purpose: Make AI matching decisions transparent and understandable.
Model: Claude Haiku 4.5 (natural language generation)
Capabilities:
- Generates per-provider explanations (why this provider was recommended)
- Generates per-dimension explanations (why the clinical score is X)
- Adapts language to the patient's `preferred_language`
- Adapts complexity to the patient's indicated health literacy level
- Highlights strengths and potential concerns for each match
```python
class ExplanationOutput(BaseModel):
    """Output from the Explanation Agent."""

    provider_id: str
    summary: str                            # 2-3 sentence overview
    strengths: list[str]                    # Top 3 strengths
    considerations: list[str]               # Things to be aware of
    dimension_explanations: dict[str, str]  # Per-scoring-dimension
    confidence_note: Optional[str]          # If data completeness is low
    locale: str                             # Language code used
```
> **Locale-Aware Explanations**
>
> The Explanation Agent detects the patient's preferred language from their profile and generates explanations in that language. For the POC, English, Hindi, Arabic, Turkish, and Thai are supported.
## Deterministic Fallbacks
Every agent node has a fallback implementation that runs without LLM calls:
```python
async def extract_clinical_entities(state: ClinicalContextState) -> ClinicalContextState:
    """Extract clinical entities from document text."""
    try:
        # Primary: LLM-based extraction
        result = await llm_extract(state["raw_text"])
        state["extracted_entities"] = result
    except (LLMError, TimeoutError, ValidationError) as e:
        # Fallback: Regex + lookup table extraction
        logger.warning(f"LLM extraction failed, using fallback: {e}")
        result = regex_extract(state["raw_text"])
        state["extracted_entities"] = result
        state["fallback_used"] = True
        state["errors"].append(f"Fallback used for extraction: {str(e)}")
    return state
```
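For concreteness, the `regex_extract` fallback might look like the minimal sketch below. The pattern and lab names are illustrative; the production fallback uses broader patterns and lookup tables:

```python
import re

# Hypothetical fallback pattern: lab name followed by a numeric value,
# e.g. "HbA1c: 7.2" or "LDL 130".
LAB_PATTERN = re.compile(
    r"(?P<name>HbA1c|LDL|Hemoglobin)\s*[:=]?\s*(?P<value>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def regex_extract(raw_text: str) -> dict:
    """Deterministic fallback extraction: regex scan over OCR text."""
    labs = [
        {"name": m.group("name"), "value": float(m.group("value"))}
        for m in LAB_PATTERN.finditer(raw_text)
    ]
    # Same output shape as the LLM path, so downstream nodes are unaffected.
    return {"conditions": [], "labs": labs, "medications": []}
```

Keeping the fallback's output shape identical to the LLM path is what lets the rest of the graph run unchanged when `fallback_used` is set.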
| Agent | Primary Path | Fallback Path | Fallback Quality |
|---|---|---|---|
| Clinical Context | Claude Sonnet extraction | Regex + lookup tables | ~70% of LLM accuracy |
| Intake | Claude Haiku conversation | Form-based collection | Functional but rigid |
| Match | LLM-enhanced scoring | Weighted rules only | ~90% of LLM accuracy |
| Explanation | Claude Haiku generation | Template-based text | Functional but generic |
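The Explanation Agent's template path can be sketched as simple string formatting. The template text and field names here are hypothetical; the production templates are localized per language:

```python
# Hypothetical fallback template for a per-provider summary.
SUMMARY_TEMPLATE = (
    "{provider} in {country} matched your case with an overall score of "
    "{score:.0%}. It was selected mainly for its {top_dimension} fit."
)


def template_explanation(match: dict) -> str:
    """Deterministic fallback: fill a fixed template instead of calling an LLM."""
    # Pick the highest-scoring dimension to headline the explanation.
    top_dimension = max(
        match["dimension_scores"], key=match["dimension_scores"].get
    )
    return SUMMARY_TEMPLATE.format(
        provider=match["provider_name"],
        country=match["country"],
        score=match["overall_score"],
        top_dimension=top_dimension,
    )
```

The output is generic but always grammatical and always grounded in the actual scores, which is the trade-off the "Functional but generic" column describes.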
## MCP Server
Curaway exposes an MCP (Model Context Protocol) server with 6 tools for external AI assistants to interact with the platform:
| Tool | Description | Parameters |
|---|---|---|
| `search_patients` | Find patients by name, email, or ID | `query`, `tenant_id` |
| `get_patient_clinical_summary` | Get FHIR-based clinical summary | `patient_id`, `tenant_id` |
| `search_providers` | Search providers by specialty, location, accreditation | `criteria`, `tenant_id` |
| `run_match` | Execute matching for a case | `case_id`, `tenant_id` |
| `get_match_explanation` | Get explanation for a match result | `match_id`, `tenant_id` |
| `check_consent` | Verify patient consent status | `patient_id`, `consent_type`, `tenant_id` |
```python
# MCP tool registration
@mcp_server.tool("search_providers")
async def search_providers(criteria: ProviderSearchCriteria, tenant_id: str):
    """Search for healthcare providers matching the given criteria."""
    results = await provider_service.search(
        tenant_id=tenant_id,
        specialty=criteria.specialty,
        country=criteria.country,
        accreditation=criteria.accreditation,
        max_results=criteria.max_results or 10,
    )
    return [provider.to_mcp_response() for provider in results]
```
## Feature Flags
Agent behavior is controlled by Flagsmith feature flags:
| Flag | Default | Description |
|---|---|---|
| `agent_enhanced_matching` | `false` | Use Match Agent instead of pure deterministic matching |
| `agent_explanations_enabled` | `true` | Generate LLM explanations (vs. template-based) |
| `clinical_context_agent_enabled` | `true` | Use LangGraph clinical extraction pipeline |
| `intake_agent_conversational` | `true` | Conversational intake vs. form-based |
| `mcp_server_enabled` | `false` | Expose MCP tools externally |
## Observability

### Events Table

Every agent action is logged to the `events` table:
```python
await log_event(
    tenant_id=tenant_id,
    event_type="agent.clinical_context.extraction_complete",
    case_id=case_id,
    payload={
        "document_id": doc_id,
        "entities_found": len(entities),
        "fallback_used": False,
        "duration_ms": elapsed,
    },
)
```
### Langfuse Traces
Each agent invocation creates a Langfuse trace with:
- Trace: Full agent execution (e.g., `clinical_context_agent`)
- Spans: Individual node executions (e.g., `extract_clinical_entities`)
- Generations: LLM calls with input/output tokens and cost
- Scores: Quality metrics (extraction accuracy, explanation helpfulness)
```mermaid
graph TD
    T[Trace: clinical_context_agent] --> S1[Span: extract_clinical_entities]
    T --> S2[Span: map_to_medical_codes]
    T --> S3[Span: generate_fhir_resources]
    T --> S4[Span: store_resources]
    S1 --> G1[Generation: claude-sonnet-4.6]
    S2 --> G2[Generation: claude-haiku-4.5]
    S3 --> G3[Generation: claude-haiku-4.5]
    style T fill:#008B8B,color:#fff
    style S1 fill:#4A90D9,color:#fff
    style S2 fill:#4A90D9,color:#fff
    style S3 fill:#4A90D9,color:#fff
    style S4 fill:#4A90D9,color:#fff
    style G1 fill:#FF7F50,color:#fff
    style G2 fill:#FF7F50,color:#fff
    style G3 fill:#FF7F50,color:#fff
```
## Model Selection
| Agent / Task | Model | Rationale |
|---|---|---|
| Clinical Context Agent (extraction) | Claude Sonnet 4.6 | Highest accuracy needed for medical data |
| Clinical Context Agent (coding) | Claude Haiku 4.5 | Lookup-heavy, lower complexity |
| Intake Agent | Claude Haiku 4.5 | Conversational, high-volume |
| Match Agent (orchestration) | Claude Haiku 4.5 | Mostly deterministic scoring |
| Match Agent (reranking) | Claude Sonnet 4.6 | Complex multi-factor reasoning |
| Explanation Agent | Claude Haiku 4.5 | Natural language generation |
| MCP Tools | Claude Haiku 4.5 | External tool responses |
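The table above amounts to a small routing map. A sketch, where the task keys are illustrative but the model identifiers match those shown in the Langfuse trace diagram:

```python
# Task-to-model routing table mirroring the selection rationale above.
MODEL_FOR_TASK = {
    "clinical_extraction": "claude-sonnet-4.6",
    "clinical_coding": "claude-haiku-4.5",
    "intake": "claude-haiku-4.5",
    "match_orchestration": "claude-haiku-4.5",
    "match_reranking": "claude-sonnet-4.6",
    "explanation": "claude-haiku-4.5",
    "mcp_tools": "claude-haiku-4.5",
}


def model_for(task: str) -> str:
    """Pick a model for a task, defaulting to the cheaper tier."""
    return MODEL_FOR_TASK.get(task, "claude-haiku-4.5")
```

Defaulting unknown tasks to the cheaper model keeps cost bounded when new tasks are added before the routing table is updated.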