00 — Overview, Scope & MVP Definition

MVP Objective

Single flagship MVP: Curaway Clinical Match Engine v1

Upload medical report → AI clinical understanding → Structured FHIR record → Provider match → Explainable reasoning

This flow validates clinical intelligence, data modeling, multi-agent orchestration, conversational interaction, and matching intelligence in one cohesive demo.

Key Constraints

Constraint	Value	Implication
Budget	$1,000 total (~$312 spend, ~$688 reserve)	Free tiers on 21 of 25 services
Team	Non-coder CPO/CTO + AI dev tools	All code via Claude Code + Cursor
Timeline	16 build sessions to demo-ready	Modular monolith, not microservices
Infra	Railway Pro + free-tier services	No GPU, no self-hosted models
Compliance	GDPR from day 1	Consent-gated everything, audit logging

In Scope (MVP)

Patient registration + GDPR consent collection
Conversational intake via Intake Agent
Document upload with async OCR/parsing pipeline
AI clinical context extraction (Clinical Context Agent)
Real-time EHR assembly from chat + document events
Four-stage provider matching (Qdrant → Neo4j → PostgreSQL → LLM)
Explainable match reasoning in patient locale
Provider storefront with doctor profiles
Agent orchestrator with unified /chat endpoint
Full observability (Langfuse, PostHog, events table)

Out of Scope (Post-MVP)

Video consultations (schema built, Daily.co deferred)
Payment processing / booking confirmation
Mobile native apps (API-ready, React Native post-seed)
ML matching v2 (learning-to-rank, needs outcome data)
MedGemma integration (shadow eval post-seed)
Provider webhook delivery (schema built, delivery stubbed)
SMS notifications (Twilio SDK installed, not wired)

Demo Scenario: Aisha's TKR Journey

Attribute	Value
Patient	Aisha (demo persona)
Location	UAE (Dubai)
Language	Arabic (preferred), English (secondary)
Condition	Primary osteoarthritis, right knee (ICD-10: M17.11)
Procedure	Total Knee Replacement (CPT: 27447)
Severity	Kellgren-Lawrence Grade 4
Comorbidities	Type 2 diabetes (E11), mild hypertension (I10)
Preferences	Halal dietary, Arabic-speaking staff, female nursing preference
Budget	$6,000–$12,000 USD
Timeline	Flexible, within 3 months
Tenant	`tenant-apollo-001`

Eight Workflow Phases

Phase 1: Registration & Consent

The patient registers via Clerk. The system collects GDPR-compliant, purpose-specific consent. Without consent, all downstream processing is blocked. Consent records are immutable: a change creates a new version and never updates an existing one.

Endpoint: POST /api/v1/patients
Consent purposes: clinical_data_processing, cross_border_transfer, provider_sharing, analytics
Events emitted: patient_registered, consent_granted
Routes to: Intake Agent
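
The consent rules above (purpose-specific, immutable, gating downstream work) can be sketched as an append-only ledger. This is a minimal illustration, not the project's actual data model: `ConsentRecord`, `ConsentLedger`, and the in-memory list are assumptions, and only the four purpose names come from this document.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone

# The four consent purposes listed above.
PURPOSES = {
    "clinical_data_processing",
    "cross_border_transfer",
    "provider_sharing",
    "analytics",
}


@dataclass(frozen=True)  # frozen: a stored record cannot be mutated
class ConsentRecord:
    patient_id: str
    purpose: str
    granted: bool
    version: int
    recorded_at: datetime


class ConsentLedger:
    """Hypothetical append-only store: a change adds a new version."""

    def __init__(self) -> None:
        self._records: list[ConsentRecord] = []

    def latest(self, patient_id: str, purpose: str) -> ConsentRecord | None:
        matches = [r for r in self._records
                   if r.patient_id == patient_id and r.purpose == purpose]
        return max(matches, key=lambda r: r.version, default=None)

    def record(self, patient_id: str, purpose: str, granted: bool) -> ConsentRecord:
        if purpose not in PURPOSES:
            raise ValueError(f"unknown consent purpose: {purpose}")
        previous = self.latest(patient_id, purpose)
        version = 1 if previous is None else previous.version + 1
        rec = ConsentRecord(patient_id, purpose, granted, version,
                            datetime.now(timezone.utc))
        self._records.append(rec)  # never update in place
        return rec

    def require(self, patient_id: str, purpose: str) -> None:
        """Gate: raise unless the newest version grants this purpose."""
        current = self.latest(patient_id, purpose)
        if current is None or not current.granted:
            raise PermissionError(f"consent missing for {purpose}")
```

A handler would call `require(...)` before any processing step; withdrawing consent is just another `record(..., granted=False)` call, which leaves the earlier versions available for audit.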

Phase 2: Conversational Intake

Intake Agent guides Aisha through information collection via natural conversation. Records-first principle: extract from documents first, ask questions only for gaps.

Endpoint: POST /api/v1/patients/{id}/chat (routed to Intake Agent)
Agent nodes: classify_intent → collect_information → suggest_actions → update_progress
State: Stored in the events table, not in memory, so a conversation can be resumed
Completion: intake_progress float (0.0–1.0), updated on every interaction
Model: Claude Haiku 4.5 (~$0.01 per intake conversation)
Fallback: Standard form-based intake via REST endpoints
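
Because state lives in the events table, a conversation can be resumed by replaying its events. The sketch below assumes a hypothetical `field_collected` event type and a hypothetical list of required fields; neither is defined in this document. It shows one way `intake_progress` could be derived as a 0.0–1.0 float.

```python
from __future__ import annotations

# Hypothetical set of fields the Intake Agent must collect.
REQUIRED_FIELDS = ("condition", "procedure", "budget", "timeline", "language")


def replay(events: list[dict]) -> dict:
    """Rebuild conversation state by folding stored events in order."""
    collected: dict[str, str] = {}
    for event in events:
        if event["type"] == "field_collected":  # hypothetical event type
            collected[event["field"]] = event["value"]
    done = sum(1 for name in REQUIRED_FIELDS if name in collected)
    return {
        "collected": collected,
        "intake_progress": round(done / len(REQUIRED_FIELDS), 2),
    }


events = [
    {"type": "patient_registered"},
    {"type": "field_collected", "field": "condition", "value": "M17.11"},
    {"type": "field_collected", "field": "budget", "value": "6000-12000 USD"},
]
state = replay(events)
```

Replaying the same events always yields the same state, so a dropped session loses nothing that was already stored.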

Phase 3: Document Upload & Processing

Async, non-blocking pipeline:

Client requests presigned upload URL: POST /api/v1/uploads/presign
Client uploads directly to R2 (bypasses API server)
Client confirms: POST /api/v1/uploads/confirm with storage_key
API creates document_reference in PostgreSQL
QStash dispatches async OCR job (non-blocking)
PyMuPDF (primary) → Unstructured.io (first fallback) → Claude vision (final fallback)
Extracted text auto-chains into Clinical Context Agent
SSE pushes status updates to frontend in real-time

Critical: Document parsing is non-blocking. Conversation continues while documents process.
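
The parser order in the pipeline above is a fallback chain. The sketch below shows the pattern with injected callables instead of real PyMuPDF, Unstructured.io, or Claude calls; the function name `extract_text` and the stub parsers are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Callable

Parser = Callable[[bytes], str]


def extract_text(data: bytes, parsers: list[tuple[str, Parser]]) -> tuple[str, str]:
    """Try each parser in order and return (parser_name, text) from the
    first one that yields non-empty text."""
    errors: list[str] = []
    for name, parse in parsers:
        try:
            text = parse(data).strip()
        except Exception as exc:  # a failing parser falls through to the next
            errors.append(f"{name}: {exc}")
            continue
        if text:
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("all parsers failed: " + "; ".join(errors))


# Stub parsers standing in for the three real backends.
def pymupdf_stub(data: bytes) -> str:
    return ""  # e.g. a scanned PDF with no text layer


def unstructured_stub(data: bytes) -> str:
    raise TimeoutError("service unavailable")


def vision_stub(data: bytes) -> str:
    return "Primary osteoarthritis, right knee"


used, text = extract_text(b"%PDF-", [
    ("pymupdf", pymupdf_stub),
    ("unstructured", unstructured_stub),
    ("claude_vision", vision_stub),
])
```

Treating an empty result as a failure matters here: a scanned PDF parses without error but yields no text, and should still fall through to the next parser.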

Phase 4: Clinical Context Extraction

The centerpiece of the demo. Raw medical report text → validated FHIR R4 resources with ICD-10/SNOMED codes. Auto-triggers on document parse completion.

Endpoint: POST /api/v1/patients/{id}/analyze-report
LangGraph nodes: extract_clinical_entities → map_to_medical_codes → generate_fhir_resources → store_resources
State: { patient_id, tenant_id, raw_text, report_type, extracted_entities[], coded_entities[], fhir_resources[], stored_resource_ids[], errors[] }
Model: Claude Haiku 4.5 (~$0.01 per report)
Output: FHIR R4 Condition (M17.11), Procedure (27447), Observation (labs), AllergyIntolerance
Validation: Every FHIR resource validated against HL7 R4 schema via fhir.resources
Fallback: 202 Accepted, raw text stored, extraction queued for QStash retry
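
The state shape and node order above can be illustrated without LangGraph. The sketch below runs the four node names as plain Python functions over a `TypedDict`; the lookup table stands in for the model-backed extraction and coding steps, the resource IDs are made up, and nothing here is validated against the HL7 schema. It shows the data flow only, not the project's implementation.

```python
from __future__ import annotations

from typing import Callable, TypedDict


class ClinicalState(TypedDict):
    patient_id: str
    tenant_id: str
    raw_text: str
    report_type: str
    extracted_entities: list[dict]
    coded_entities: list[dict]
    fhir_resources: list[dict]
    stored_resource_ids: list[str]
    errors: list[str]


# Toy lookup standing in for the model-backed steps (assumption).
KNOWN_CONDITIONS = {"primary osteoarthritis, right knee": "M17.11"}
ICD10_CM = "http://hl7.org/fhir/sid/icd-10-cm"  # FHIR system URI for ICD-10-CM


def extract_clinical_entities(state: ClinicalState) -> ClinicalState:
    text = state["raw_text"].lower()
    found = [{"text": term} for term in KNOWN_CONDITIONS if term in text]
    return {**state, "extracted_entities": found}


def map_to_medical_codes(state: ClinicalState) -> ClinicalState:
    coded = [{**e, "system": ICD10_CM, "code": KNOWN_CONDITIONS[e["text"]]}
             for e in state["extracted_entities"]]
    return {**state, "coded_entities": coded}


def generate_fhir_resources(state: ClinicalState) -> ClinicalState:
    resources = [{
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{state['patient_id']}"},
        "code": {"coding": [{"system": e["system"], "code": e["code"]}],
                 "text": e["text"]},
    } for e in state["coded_entities"]]
    return {**state, "fhir_resources": resources}


def store_resources(state: ClinicalState) -> ClinicalState:
    # In-memory stand-in for persistence; IDs are invented for the sketch.
    ids = [f"{r['resourceType']}/{i}"
           for i, r in enumerate(state["fhir_resources"], start=1)]
    return {**state, "stored_resource_ids": ids}


NODES: list[Callable[[ClinicalState], ClinicalState]] = [
    extract_clinical_entities, map_to_medical_codes,
    generate_fhir_resources, store_resources,
]


def run_pipeline(state: ClinicalState) -> ClinicalState:
    for node in NODES:
        try:
            state = node(state)
        except Exception as exc:  # record the failure and stop early
            return {**state, "errors": state["errors"] + [f"{node.__name__}: {exc}"]}
    return state


result = run_pipeline({
    "patient_id": "aisha-demo", "tenant_id": "tenant-apollo-001",
    "raw_text": "Impression: Primary osteoarthritis, right knee, KL grade 4.",
    "report_type": "radiology", "extracted_entities": [], "coded_entities": [],
    "fhir_resources": [], "stored_resource_ids": [], "errors": [],
})
```

Each node returns a new state rather than mutating its input, which keeps intermediate results inspectable when a later step fails.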

Phase 5: EHR Assembly (Real-Time)

Clinical data from multiple sources (chat, documents, agents) is assembled progressively. Every new data point triggers an event that updates the structured health record; this is not a batch process. See 05-ehr-builder.md.

Phase 6: Provider Matching

Four-stage pipeline when sufficient clinical data exists:

Stage 0 — Qdrant: Semantic discovery (always-on, never skipped)
Stage 1 — Neo4j: Hard constraint filtering (pass/fail)
Stage 2 — PostgreSQL: Weighted scoring (5 domains)
Stage 3 — LLM: Re-ranking + explanation generation
Feature flag: agent_enhanced_matching; when disabled, stages 0 and 3 are skipped
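
Stages 1 and 2 are the deterministic core of the pipeline: a pass/fail filter followed by a weighted sum. The sketch below illustrates that pattern in plain Python rather than Neo4j or PostgreSQL queries. The five domain names, their weights, and the choice of procedure and budget as hard constraints are assumptions; this document states only that there are five domains.

```python
from __future__ import annotations

# Hypothetical domains and weights (must sum to 1.0).
WEIGHTS = {
    "clinical": 0.35,
    "cost": 0.25,
    "language": 0.15,
    "logistics": 0.15,
    "preferences": 0.10,
}


def passes_hard_constraints(provider: dict, request: dict) -> bool:
    """Stage 1: pass/fail. A provider failing any check is dropped."""
    return (request["procedure_cpt"] in provider["procedures"]
            and provider["price_usd"] <= request["budget_max_usd"])


def weighted_score(domain_scores: dict[str, float]) -> float:
    """Stage 2: weighted sum of per-domain scores in [0, 1]."""
    total = sum(WEIGHTS[d] * domain_scores.get(d, 0.0) for d in WEIGHTS)
    return round(total, 3)


def match(providers: list[dict], request: dict) -> list[tuple[str, float]]:
    eligible = [p for p in providers if passes_hard_constraints(p, request)]
    scored = [(p["name"], weighted_score(p["scores"])) for p in eligible]
    return sorted(scored, key=lambda item: item[1], reverse=True)


request = {"procedure_cpt": "27447", "budget_max_usd": 12000}
providers = [
    {"name": "A", "procedures": {"27447"}, "price_usd": 9000,
     "scores": {d: 0.8 for d in WEIGHTS}},
    {"name": "B", "procedures": {"27447"}, "price_usd": 15000,  # over budget
     "scores": {d: 1.0 for d in WEIGHTS}},
    {"name": "C", "procedures": {"27447"}, "price_usd": 7000,
     "scores": {"clinical": 1.0, "cost": 0.5, "language": 0.5,
                "logistics": 0.5, "preferences": 0.5}},
]
ranking = match(providers, request)
```

Provider B scores highest on every domain but never reaches scoring, which is the point of running hard constraints first: a strong score cannot compensate for a failed constraint.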

Phase 7: Explainable Reasoning

The Explanation Agent generates match reasoning in the patient's locale (Arabic for Aisha). Fallback: template-based strings.
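
The template fallback can be sketched as a wrapper that tries a generator and falls back to a locale-keyed string when it fails or returns nothing. The `explain` function, the `generate` callable, and both template strings are illustrative assumptions; real patient-facing copy would need review by native speakers.

```python
from __future__ import annotations

from typing import Callable

# Illustrative templates only (assumption, not production copy).
TEMPLATES = {
    "en": "{provider} matches your needs with a score of {score}%.",
    "ar": "يتوافق {provider} مع احتياجاتك بنسبة {score}%.",
}

Generator = Callable[[str, float, str], str]


def explain(provider: str, score: float, locale: str,
            generate: Generator | None = None) -> str:
    """Return generated reasoning, or a template string if generation
    is unavailable, fails, or returns an empty result."""
    if generate is not None:
        try:
            text = generate(provider, score, locale)
            if text:
                return text
        except Exception:
            pass  # fall through to the template
    template = TEMPLATES.get(locale, TEMPLATES["en"])
    return template.format(provider=provider, score=round(score * 100))


def failing_generator(provider: str, score: float, locale: str) -> str:
    raise TimeoutError("model unavailable")


fallback_text = explain("Provider A", 0.87, "en", generate=failing_generator)
```

Unknown locales fall back to English here; whether that is the right default for the product is a design decision this document does not settle.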

Phase 8: Results & Iteration

Match results appear in the conversation thread with provider cards, confidence scores, cost estimates, and AI reasoning. The patient can ask follow-up questions, re-match with different priorities, or shortlist providers.

User Roles

Role	Description	Auth
Patient	Interacts via conversation UI, uploads docs, views matches	Clerk auth, patient-scoped, consent-gated
Provider Admin	Reviews cases, manages profile, views analytics	Clerk org auth, tenant-scoped, RBAC
Super Admin	Cross-tenant visibility, system config, audit logs	Clerk super admin, all tenants, audit-logged
External AI	Claude Desktop, partner agents via MCP Server	MCP auth token, tenant-isolated, consent-checked