
Agent System

Overview

Curaway's AI agent system is the brain of the platform. It orchestrates multi-step workflows that combine clinical understanding, patient interaction, provider matching, and natural-language explanation. The system is built on three pillars:

  • LangGraph -- Orchestration framework for multi-node, stateful AI workflows
  • LangChain -- Tool wrappers that give agents access to databases, APIs, and external services
  • Langfuse -- Observability platform for tracing, prompt management, and cost tracking

Healthcare Safety Principle

Every agent has a deterministic fallback path. If an LLM call fails, times out, or returns invalid output, the system falls back to rule-based logic. Healthcare workflows cannot be broken by LLM failure.


Architecture

Single Entry Point

All patient interactions flow through one API endpoint:

POST /api/v1/cases/{case_id}/chat
@router.post("/api/v1/cases/{case_id}/chat")
async def chat(
    case_id: UUID,
    request: ChatRequest,
    tenant_id: str = Header(alias="X-Tenant-ID"),
):
    """Single entry point for all case-related conversations."""
    orchestrator = CaseOrchestrator(case_id=case_id, tenant_id=tenant_id)
    response = await orchestrator.process(request.message)
    return ChatResponse(message=response.content, phase=response.phase)

Orchestrator

The orchestrator manages 8 workflow phases, routing messages to the appropriate sub-agent based on the current case state:

graph TD
    Entry[POST /chat] --> Orch[Orchestrator]
    Orch --> Phase1[1. Initial Contact]
    Orch --> Phase2[2. Document Upload]
    Orch --> Phase3[3. Clinical Review]
    Orch --> Phase4[4. Intake Collection]
    Orch --> Phase5[5. Matching]
    Orch --> Phase6[6. Results Presentation]
    Orch --> Phase7[7. Consultation Scheduling]
    Orch --> Phase8[8. Follow-up]

    Phase2 --> CCA[Clinical Context Agent]
    Phase4 --> IA[Intake Agent]
    Phase5 --> MA[Match Agent]
    Phase6 --> EA[Explanation Agent]

    style Orch fill:#008B8B,color:#fff
    style CCA fill:#FF7F50,color:#fff
    style IA fill:#FF7F50,color:#fff
    style MA fill:#FF7F50,color:#fff
    style EA fill:#FF7F50,color:#fff

| Phase | Agent | Description |
|---|---|---|
| 1. Initial Contact | Orchestrator (direct) | Greeting, case creation, basic info |
| 2. Document Upload | Clinical Context Agent | Process uploaded medical documents |
| 3. Clinical Review | Clinical Context Agent | Extract entities, generate FHIR resources |
| 4. Intake Collection | Intake Agent | Gather preferences, constraints, requirements |
| 5. Matching | Match Agent | Run provider matching algorithm |
| 6. Results Presentation | Explanation Agent | Present matches with explanations |
| 7. Consultation Scheduling | Orchestrator (direct) | Book video consults with matched doctors |
| 8. Follow-up | Orchestrator (direct) | Post-match communication, next steps |
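The routing above amounts to a dispatch table keyed on the case's current phase. A minimal sketch (the `Phase` enum and handler names here are illustrative, not `CaseOrchestrator`'s actual internals):

```python
from enum import Enum

class Phase(Enum):
    INITIAL_CONTACT = 1
    DOCUMENT_UPLOAD = 2
    CLINICAL_REVIEW = 3
    INTAKE_COLLECTION = 4
    MATCHING = 5
    RESULTS_PRESENTATION = 6
    CONSULTATION_SCHEDULING = 7
    FOLLOW_UP = 8

# Phases 2-6 delegate to sub-agents; the rest are handled by the
# orchestrator directly (handler names are illustrative).
PHASE_HANDLERS = {
    Phase.DOCUMENT_UPLOAD: "clinical_context_agent",
    Phase.CLINICAL_REVIEW: "clinical_context_agent",
    Phase.INTAKE_COLLECTION: "intake_agent",
    Phase.MATCHING: "match_agent",
    Phase.RESULTS_PRESENTATION: "explanation_agent",
}

def route(phase: Phase) -> str:
    """Return the sub-agent for a phase, or fall through to the orchestrator."""
    return PHASE_HANDLERS.get(phase, "orchestrator")
```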

First-Message Attachments

When a user sends their first message with a file attachment (e.g. "I need a knee replacement" + blood work PDF), the orchestrator identifies the procedure and processes the attachments in a single turn — it does not ask for records that were just uploaded. The procedure confirmation response is combined with the document analysis.


The Four Agents

1. Clinical Context Agent

The Clinical Context Agent processes medical documents and extracts structured clinical data. It is the most complex agent, implemented as a 4-node LangGraph workflow.

Purpose: Transform unstructured medical documents into structured FHIR R4 resources.

Model: Claude Sonnet 4.6 (requires high-accuracy clinical reasoning)

graph LR
    A[extract_clinical_entities] --> B[map_to_medical_codes]
    B --> C[generate_fhir_resources]
    C --> D[store_resources]

    style A fill:#008B8B,color:#fff
    style B fill:#008B8B,color:#fff
    style C fill:#008B8B,color:#fff
    style D fill:#008B8B,color:#fff

Node Details:

| Node | Input | Output | Fallback |
|---|---|---|---|
| extract_clinical_entities | Raw OCR text | Structured entities (conditions, labs, meds) | Regex-based extraction patterns |
| map_to_medical_codes | Extracted entities | ICD-10, CPT, LOINC codes | Lookup table mapping |
| generate_fhir_resources | Coded entities | FHIR R4 JSON resources | Template-based FHIR generation |
| store_resources | Validated FHIR | Database confirmation | Direct SQL insert |

State Schema:

class ClinicalContextState(TypedDict):
    """State passed between Clinical Context Agent nodes."""
    document_id: str
    tenant_id: str
    patient_id: str
    case_id: str
    raw_text: str
    extracted_entities: dict          # From node 1
    medical_codes: dict               # From node 2
    fhir_resources: list[dict]        # From node 3
    store_confirmation: dict          # From node 4
    errors: list[str]
    fallback_used: bool
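
In LangGraph these four nodes form a linear graph over that shared state. A dependency-free sketch of the same sequential state-passing (node bodies are stubbed, since the real implementations call the LLM and the database):

```python
from typing import Callable

# Each node receives the shared state dict, fills in its own slice,
# and returns it -- mirroring the TypedDict contract above.
def extract_clinical_entities(state: dict) -> dict:
    state["extracted_entities"] = {"conditions": ["stub"]}  # stubbed LLM call
    return state

def map_to_medical_codes(state: dict) -> dict:
    state["medical_codes"] = {"icd10": ["stub"]}  # stubbed code mapping
    return state

def generate_fhir_resources(state: dict) -> dict:
    state["fhir_resources"] = [{"resourceType": "Condition"}]  # stubbed
    return state

def store_resources(state: dict) -> dict:
    state["store_confirmation"] = {"stored": len(state["fhir_resources"])}
    return state

PIPELINE: list[Callable[[dict], dict]] = [
    extract_clinical_entities,
    map_to_medical_codes,
    generate_fhir_resources,
    store_resources,
]

def run(state: dict) -> dict:
    """Thread the state through each node in order."""
    for node in PIPELINE:
        state = node(state)
    return state
```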

Comorbidity Detection

Comorbidity detection is rule-based, not LLM-based. The system maintains a lookup table of common comorbidity pairs (e.g., diabetes + hypertension, obesity + sleep apnea) and flags them deterministically. This costs $0 per case.
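A minimal sketch of that deterministic check (the specific pairs and condition labels here are illustrative, not the production table):

```python
# Unordered pairs of condition labels that commonly co-occur.
COMORBIDITY_PAIRS = {
    frozenset({"diabetes", "hypertension"}),
    frozenset({"obesity", "sleep_apnea"}),
    frozenset({"copd", "heart_failure"}),
}

def flag_comorbidities(conditions: list[str]) -> list[set[str]]:
    """Return every known comorbidity pair present in the patient's conditions."""
    present = {c.lower() for c in conditions}
    return [set(pair) for pair in COMORBIDITY_PAIRS if pair <= present]
```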


2. Intake Agent

The Intake Agent conducts conversational intake to gather patient preferences, travel constraints, and treatment requirements.

Purpose: Collect structured preferences through natural conversation.

Model: Claude Haiku 4.5 (conversational, low-cost)

graph LR
    A[classify_message] --> B[collect_preferences]
    B --> C[suggest_options]
    C --> D[update_case]

    style A fill:#008B8B,color:#fff
    style B fill:#008B8B,color:#fff
    style C fill:#008B8B,color:#fff
    style D fill:#008B8B,color:#fff

Node Details:

| Node | Input | Output | Fallback |
|---|---|---|---|
| classify_message | Patient message | Intent classification | Keyword matching |
| collect_preferences | Classified intent | Structured preference data | Form-based collection |
| suggest_options | Current preferences | Contextual suggestions | Static suggestion list |
| update_case | Confirmed preferences | Updated case record | Direct DB update |
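The keyword-matching fallback for `classify_message` can be sketched as follows (the keywords and intent labels are illustrative assumptions):

```python
# Illustrative intent -> trigger-keyword map for the non-LLM path.
INTENT_KEYWORDS = {
    "budget": ["budget", "cost", "price", "afford"],
    "travel": ["fly", "travel", "country", "visa"],
    "scheduling": ["date", "when", "schedule", "available"],
}

def classify_message_fallback(message: str) -> str:
    """Deterministic intent classification used when the LLM call fails."""
    text = message.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return intent
    return "general"
```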

Collected Preferences:

class PatientPreferences(BaseModel):
    """Preferences collected by the Intake Agent."""
    budget_range_usd: Optional[tuple[int, int]]
    preferred_countries: list[str]           # ISO 3166-1 alpha-3
    excluded_countries: list[str]
    preferred_languages: list[str]
    travel_date_range: Optional[tuple[date, date]]
    companion_count: int = 0
    dietary_restrictions: list[str]
    accessibility_needs: list[str]
    insurance_provider: Optional[str]
    previous_medical_travel: bool = False
    priority: str = "balanced"               # "cost", "quality", "speed", "balanced"

State Schema:

class IntakeState(TypedDict):
    """State passed between Intake Agent nodes."""
    case_id: str
    tenant_id: str
    patient_id: str
    message: str
    intent: str                              # From node 1
    current_preferences: dict                # Existing preferences
    new_preferences: dict                    # From node 2
    suggestions: list[str]                   # From node 3
    update_confirmation: dict                # From node 4
    conversation_history: list[dict]
    missing_fields: list[str]
    errors: list[str]

3. Match Agent

The Match Agent orchestrates the provider matching workflow, combining graph traversal, semantic search, and weighted scoring.

Purpose: Find and rank the best providers and doctors for a patient's case.

Model: Claude Haiku 4.5 (orchestration) + deterministic scoring

graph LR
    A[analyze_requirements] --> B[gather_requirements]
    B --> C[execute_scoring]
    C --> D[rerank_and_explain]

    style A fill:#FF7F50,color:#fff
    style B fill:#FF7F50,color:#fff
    style C fill:#FF7F50,color:#fff
    style D fill:#FF7F50,color:#fff

Node Details:

| Node | Input | Output | Fallback |
|---|---|---|---|
| analyze_requirements | Case data, FHIR resources | Structured matching criteria | Rule-based criteria extraction |
| gather_requirements | Matching criteria | Provider candidates from Neo4j + Qdrant | Direct Neo4j query |
| execute_scoring | Candidates + criteria | Scored and ranked results | Weighted rules scoring |
| rerank_and_explain | Scored results | Final ranking with explanations | Template-based explanations |
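The weighted-rules fallback for `execute_scoring` reduces to a dot product of per-dimension scores and weights. A sketch (the dimension names and weights are assumptions for illustration, not the production values):

```python
# Illustrative scoring dimensions; real weights live in configuration.
WEIGHTS = {"clinical": 0.4, "cost": 0.25, "location": 0.2, "availability": 0.15}

def score_candidate(dimension_scores: dict[str, float]) -> float:
    """Weighted sum of 0-1 dimension scores; missing dimensions score 0."""
    return sum(w * dimension_scores.get(dim, 0.0) for dim, w in WEIGHTS.items())

def rank(candidates: list[dict]) -> list[dict]:
    """Sort candidates (each carrying a 'scores' dict) best-first."""
    return sorted(candidates, key=lambda c: score_candidate(c["scores"]), reverse=True)
```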

State Schema:

class MatchState(TypedDict):
    """State passed between Match Agent nodes."""
    case_id: str
    tenant_id: str
    patient_id: str
    clinical_data: dict                      # FHIR resources
    patient_preferences: dict
    matching_criteria: dict                  # From node 1
    candidates: list[dict]                   # From node 2
    scored_results: list[dict]               # From node 3
    final_results: list[dict]                # From node 4
    strategy_used: str
    errors: list[str]
    fallback_used: bool

4. Explanation Agent

The Explanation Agent generates natural-language explanations of matching results, tailored to the patient's locale and language.

Purpose: Make AI matching decisions transparent and understandable.

Model: Claude Haiku 4.5 (natural language generation)

Capabilities:

  • Generates per-provider explanations (why this provider was recommended)
  • Generates per-dimension explanations (why the clinical score is X)
  • Adapts language to patient's preferred_language
  • Adapts complexity to patient's indicated health literacy level
  • Highlights strengths and potential concerns for each match

class ExplanationOutput(BaseModel):
    """Output from the Explanation Agent."""
    provider_id: str
    summary: str                             # 2-3 sentence overview
    strengths: list[str]                     # Top 3 strengths
    considerations: list[str]                # Things to be aware of
    dimension_explanations: dict[str, str]   # Per-scoring-dimension
    confidence_note: Optional[str]           # If data completeness is low
    locale: str                              # Language code used
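
The template-based text that stands in for LLM generation in the fallback path can be sketched as (the template wording is illustrative):

```python
def template_explanation(provider_name: str, specialty: str,
                         top_dimension: str, score: float) -> str:
    """Generic, deterministic explanation used when LLM generation fails or is disabled."""
    return (
        f"{provider_name} was recommended primarily for its {top_dimension} "
        f"score of {score:.0%}. The facility specializes in {specialty}."
    )
```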

Locale-Aware Explanations

The Explanation Agent detects the patient's preferred language from their profile and generates explanations in that language. For the POC, English, Hindi, Arabic, Turkish, and Thai are supported.


Deterministic Fallbacks

Every agent node has a fallback implementation that runs without LLM calls:

async def extract_clinical_entities(state: ClinicalContextState) -> ClinicalContextState:
    """Extract clinical entities from document text."""
    try:
        # Primary: LLM-based extraction
        result = await llm_extract(state["raw_text"])
        state["extracted_entities"] = result
    except (LLMError, TimeoutError, ValidationError) as e:
        # Fallback: Regex + lookup table extraction
        logger.warning(f"LLM extraction failed, using fallback: {e}")
        result = regex_extract(state["raw_text"])
        state["extracted_entities"] = result
        state["fallback_used"] = True
        state["errors"].append(f"Fallback used for extraction: {str(e)}")
    return state

| Agent | Primary Path | Fallback Path | Fallback Quality |
|---|---|---|---|
| Clinical Context | Claude Sonnet extraction | Regex + lookup tables | ~70% of LLM accuracy |
| Intake | Claude Haiku conversation | Form-based collection | Functional but rigid |
| Match | LLM-enhanced scoring | Weighted rules only | ~90% of LLM accuracy |
| Explanation | Claude Haiku generation | Template-based text | Functional but generic |

MCP Server

Curaway exposes an MCP (Model Context Protocol) server with 6 tools for external AI assistants to interact with the platform:

| Tool | Description | Parameters |
|---|---|---|
| search_patients | Find patients by name, email, or ID | query, tenant_id |
| get_patient_clinical_summary | Get FHIR-based clinical summary | patient_id, tenant_id |
| search_providers | Search providers by specialty, location, accreditation | criteria, tenant_id |
| run_match | Execute matching for a case | case_id, tenant_id |
| get_match_explanation | Get explanation for a match result | match_id, tenant_id |
| check_consent | Verify patient consent status | patient_id, consent_type, tenant_id |

# MCP tool registration
@mcp_server.tool("search_providers")
async def search_providers(criteria: ProviderSearchCriteria, tenant_id: str):
    """Search for healthcare providers matching the given criteria."""
    results = await provider_service.search(
        tenant_id=tenant_id,
        specialty=criteria.specialty,
        country=criteria.country,
        accreditation=criteria.accreditation,
        max_results=criteria.max_results or 10,
    )
    return [provider.to_mcp_response() for provider in results]

Feature Flags

Agent behavior is controlled by Flagsmith feature flags:

| Flag | Default | Description |
|---|---|---|
| agent_enhanced_matching | false | Use Match Agent instead of pure deterministic matching |
| agent_explanations_enabled | true | Generate LLM explanations (vs. template-based) |
| clinical_context_agent_enabled | true | Use LangGraph clinical extraction pipeline |
| intake_agent_conversational | true | Conversational intake vs. form-based |
| mcp_server_enabled | false | Expose MCP tools externally |

Observability

Events Table

Every agent action is logged to the events table:

await log_event(
    tenant_id=tenant_id,
    event_type="agent.clinical_context.extraction_complete",
    case_id=case_id,
    payload={
        "document_id": doc_id,
        "entities_found": len(entities),
        "fallback_used": False,
        "duration_ms": elapsed,
    }
)

Langfuse Traces

Each agent invocation creates a Langfuse trace with:

  • Trace: Full agent execution (e.g., clinical_context_agent)
  • Spans: Individual node executions (e.g., extract_clinical_entities)
  • Generations: LLM calls with input/output tokens and cost
  • Scores: Quality metrics (extraction accuracy, explanation helpfulness)
graph TD
    T[Trace: clinical_context_agent] --> S1[Span: extract_clinical_entities]
    T --> S2[Span: map_to_medical_codes]
    T --> S3[Span: generate_fhir_resources]
    T --> S4[Span: store_resources]
    S1 --> G1[Generation: claude-sonnet-4.6]
    S2 --> G2[Generation: claude-haiku-4.5]
    S3 --> G3[Generation: claude-haiku-4.5]

    style T fill:#008B8B,color:#fff
    style S1 fill:#4A90D9,color:#fff
    style S2 fill:#4A90D9,color:#fff
    style S3 fill:#4A90D9,color:#fff
    style S4 fill:#4A90D9,color:#fff
    style G1 fill:#FF7F50,color:#fff
    style G2 fill:#FF7F50,color:#fff
    style G3 fill:#FF7F50,color:#fff

Model Selection

| Agent / Task | Model | Rationale |
|---|---|---|
| Clinical Context Agent (extraction) | Claude Sonnet 4.6 | Highest accuracy needed for medical data |
| Clinical Context Agent (coding) | Claude Haiku 4.5 | Lookup-heavy, lower complexity |
| Intake Agent | Claude Haiku 4.5 | Conversational, high-volume |
| Match Agent (orchestration) | Claude Haiku 4.5 | Mostly deterministic scoring |
| Match Agent (reranking) | Claude Sonnet 4.6 | Complex multi-factor reasoning |
| Explanation Agent | Claude Haiku 4.5 | Natural language generation |
| MCP Tools | Claude Haiku 4.5 | External tool responses |
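
The table above amounts to a static task-to-model routing map. A sketch (the task identifiers are illustrative; the model names come from the table):

```python
# Task identifiers are illustrative keys for the routing decisions above.
MODEL_ROUTING = {
    "clinical_extraction": "claude-sonnet-4.6",
    "clinical_coding": "claude-haiku-4.5",
    "intake": "claude-haiku-4.5",
    "match_orchestration": "claude-haiku-4.5",
    "match_reranking": "claude-sonnet-4.6",
    "explanation": "claude-haiku-4.5",
    "mcp_tools": "claude-haiku-4.5",
}

def model_for(task: str) -> str:
    """Resolve the model for a task, defaulting to the low-cost model."""
    return MODEL_ROUTING.get(task, "claude-haiku-4.5")
```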