
01 — System Architecture

Architectural Pattern: Modular Monolith

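One FastAPI process, one Railway container. Each domain keeps its own routers, services, schemas, and models, with no internal imports across domains; the only exceptions are the Matching and Agents orchestration domains, which read from other domains. This keeps the path open for a clean extraction into microservices post-seed.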

Decision rationale: MVP velocity, not budget. Microservices would double the debugging surface area and coordination overhead under AI-assisted development.

Technology Stack

| Layer | Technology | Role | Cost/mo |
|---|---|---|---|
| Backend | FastAPI (Python 3.12+) | API gateway + service layer | $0 (on Railway) |
| Orchestration | LangGraph | Multi-agent workflow engine | $0 (open source) |
| Toolkit | LangChain | Tool wrappers for LLM/DB/API | $0 (open source) |
| Frontend | Vite + React + TypeScript | Patient/provider/admin UIs | $0 (on Vercel) |
| Primary DB | Railway PostgreSQL | FHIR, tenancy, events, audit | Included in Railway Pro |
| Graph DB | Neo4j Aura | Clinical knowledge graph | $0 (free, 200K nodes) |
| Vector DB | Qdrant Cloud | Medical embeddings | $0 (free, 1GB) |
| Cache | Upstash Redis | Application caching | $0 (free, 10K/day) |
| Event Bus | Upstash QStash | Async events + scheduling | $0 (free, 500/day) |
| Auth/IAM | Clerk | Auth, RBAC, multi-tenant orgs | $0 (free, 10K MAU) |
| Feature Flags | Flagsmith | Toggles + A/B assignment | $0 (free, 50K/mo) |
| LLM (reasoning) | Claude API | Agent reasoning backbone | ~$30 |
| LLM (routing) | OpenAI API | Cost-efficient task routing | ~$10 |
| LLM Tracing | Langfuse Cloud | Traces, cost, latency, prompts | $0 (free tier) |
| Embeddings | Voyage AI | Medical document embeddings | $0 (free tier) |
| File Storage | Cloudflare R2 | Documents, X-rays, PDFs | $0 (free, 10GB) |
| OCR (primary) | PyMuPDF | Local PDF text extraction | $0 (bundled) |
| OCR (fallback) | Unstructured.io | Complex layout extraction | $0 (free, 1K pages) |
| Email | Resend | Transactional email | $0 (free, 3K/mo) |
| Hosting (API) | Railway Pro | FastAPI container | $20 |
| Hosting (web) | Vercel | React SPA edge deployment | $0 (free tier) |
| Analytics | PostHog Cloud | Frontend user behavior | $0 (free, 1M events) |
| BI | Metabase OSS | Dashboards + reporting | $0 (on Railway) |
| Infra monitoring | Grafana Cloud | Container metrics | $0 (free tier) |

Total: 25 services, 21 on free tiers. ~$60/mo estimated.

LLM Tiered Routing

| Task Tier | Models | Cost/MTok | Usage |
|---|---|---|---|
| Simple (80% of calls) | GPT-4o mini / Claude Haiku 4.5 | $0.15–$1.00 in, $0.60–$5.00 out | Classification, intake parsing, ICD extraction |
| Complex (20% of calls) | Claude Sonnet 4.6 / GPT-4.1 | $2.00–$3.00 in, $8.00–$15.00 out | Clinical analysis, comorbidity reasoning, re-ranking |

  • Per patient journey (6 agent calls): $0.07–$0.50
  • Target average: <$0.15 with tiered routing
  • Model selection: config/model_registry.yaml — swap models without code changes
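
As a rough illustration of the tiering above, the sketch below uses an in-memory dictionary as a stand-in for config/model_registry.yaml. The registry layout, the task key names, and the helpers `pick_model`, `call_cost`, and `blended_price` are assumptions made for illustration, not the project's actual schema. One model per tier is picked arbitrarily, with prices taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    label: str               # display label only, not an API model identifier
    usd_in_per_mtok: float   # price per million input tokens
    usd_out_per_mtok: float  # price per million output tokens


# Hypothetical in-memory stand-in for config/model_registry.yaml.
REGISTRY = {
    "simple": {"share": 0.80, "model": ModelEntry("GPT-4o mini", 0.15, 0.60)},
    "complex": {"share": 0.20, "model": ModelEntry("Claude Sonnet 4.6", 3.00, 15.00)},
}

# Task-to-tier mapping taken from the Usage column; the key names are assumed.
TASK_TIER = {
    "classification": "simple",
    "intake_parsing": "simple",
    "icd_extraction": "simple",
    "clinical_analysis": "complex",
    "comorbidity_reasoning": "complex",
    "re_ranking": "complex",
}


def pick_model(task: str) -> ModelEntry:
    """Resolve a task name to the model configured for its tier."""
    if task not in TASK_TIER:
        raise ValueError(f"no tier configured for task {task!r}")
    return REGISTRY[TASK_TIER[task]]["model"]


def call_cost(model: ModelEntry, tokens_in: int, tokens_out: int) -> float:
    """Cost in USD of a single call, given its token counts."""
    total = tokens_in * model.usd_in_per_mtok + tokens_out * model.usd_out_per_mtok
    return total / 1_000_000


def blended_price() -> tuple[float, float]:
    """Share-weighted (input, output) price per MTok across both tiers."""
    tiers = REGISTRY.values()
    price_in = sum(t["share"] * t["model"].usd_in_per_mtok for t in tiers)
    price_out = sum(t["share"] * t["model"].usd_out_per_mtok for t in tiers)
    return price_in, price_out
```

With these two entries the blended price works out to about $0.72 per MTok in (0.8 × 0.15 + 0.2 × 3.00) and $3.48 per MTok out (0.8 × 0.60 + 0.2 × 15.00), assuming calls in both tiers carry similar token counts. Swapping a model then means editing one registry entry, not the routing code.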

Domain Boundaries

| Domain | Files | External Deps | Cross-Domain |
|---|---|---|---|
| Patients | routers/patients, services/patient_service, models/patient, schemas/patient | Encryption service | None |
| Providers | routers/providers, services/provider_service, models/provider | None | None |
| Consent | routers/consent, services/consent_service, models/consent | None | None |
| FHIR/Clinical | routers/fhir, services/fhir_service, models/fhir_resource | fhir.resources lib | None |
| Documents | routers/documents, services/document_service, models/document | Cloudflare R2 | None |
| Matching | routers/match, services/match_service, services/matching_engine | None | Reads: patient, fhir, provider |
| Agents | agents/clinical_context, agents/intake, agents/match, agents/explanation | Claude API, Langfuse | Reads: patient, fhir, provider. Writes: fhir, events |
| GDPR | services/data_subject_handler | None | Cascades across all |

Rule: Matching and Agents are orchestration points — they read from other domains but those domains never import from matching/agents.
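
A rule like this can be checked mechanically in CI. The sketch below is one possible approach, using Python's standard `ast` module to list a module's imports and flag any that reach into Matching or Agents. The top-level package name `app`, the marker set, and the function names are assumptions. A real check would walk the source tree and skip files that belong to the orchestration domains themselves.

```python
import ast

PACKAGE = "app"  # assumed top-level package name

# Path segments that identify the orchestration domains (from the table above).
ORCHESTRATION_PARTS = {"agents", "match", "match_service", "matching_engine"}


def imported_names(source: str) -> set[str]:
    """Return the dotted name of everything imported by the given source text."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            base = node.module or ""
            for alias in node.names:
                names.add(f"{base}.{alias.name}".strip("."))
    return names


def boundary_violations(source: str) -> list[str]:
    """List project imports that reach into the Matching or Agents domains.

    Intended for source files of the non-orchestration domains only.
    """
    violations = []
    for name in imported_names(source):
        parts = name.split(".")
        if parts[0] == PACKAGE and ORCHESTRATION_PARTS.intersection(parts):
            violations.append(name)
    return sorted(violations)


if __name__ == "__main__":
    patient_service = "from app.models.patient import Patient\n"
    leaky_service = "from app.services.match_service import rank\n"
    print(boundary_violations(patient_service))
    print(boundary_violations(leaky_service))
```

Restricting the check to names under the project package avoids false positives from unrelated imports such as `from re import match`.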

System Layers

┌─────────────────────────────────────────────────────────────┐
│  INPUT LAYER                                                 │
│  Vite/React (app.curaway.ai) · Document Upload (R2 presign) │
└───────────────────────────┬─────────────────────────────────┘
                            │ HTTPS
┌───────────────────────────▼─────────────────────────────────┐
│  GATEWAY LAYER                                               │
│  FastAPI (services.curaway.ai) · Clerk Auth                 │
│  Tenant Middleware · Correlation ID · CORS                  │
│  Rate Limiting (post-MVP)                                   │
└───────────────────────────┬─────────────────────────────────┘
┌───────────────────────────▼─────────────────────────────────┐
│  INTELLIGENCE LAYER                                          │
│  LangGraph Agents (Intake, Clinical, Match, Explanation)     │
│  Orchestrator · Langfuse Prompts · Guardrails (YAML)         │
│  Tiered LLM Routing (Claude + OpenAI)                        │
└───────────────────────────┬─────────────────────────────────┘
┌───────────────────────────▼─────────────────────────────────┐
│  DATA LAYER                                                  │
│  Railway PostgreSQL (FHIR JSONB + events + tenancy)          │
│  Neo4j Aura (knowledge graph) · Qdrant Cloud (embeddings)   │
│  Upstash Redis (cache)                                       │
└───────────────────────────┬─────────────────────────────────┘
┌───────────────────────────▼─────────────────────────────────┐
│  INTEGRATION LAYER                                           │
│  QStash (async bus + cron) · Resend (email) · R2 (files)    │
│  MCP Server (external AI) · Flagsmith (feature flags)        │
└─────────────────────────────────────────────────────────────┘
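
To make the gateway layer's correlation ID step concrete, here is a minimal sketch written as plain ASGI middleware, so it runs with the standard library alone. The header name `x-correlation-id`, the use of `scope["state"]`, and the `demo_app` and `call` helpers are assumptions for illustration. Tenant resolution is deliberately left out, since it would depend on the verified Clerk session.

```python
import asyncio
import uuid

HEADER = b"x-correlation-id"  # assumed header name


class CorrelationIdMiddleware:
    """Reuse the caller's correlation ID or mint one, then echo it on the response."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        incoming = dict(scope.get("headers", []))
        correlation_id = incoming.get(HEADER, b"").decode() or uuid.uuid4().hex
        # Expose the ID to downstream handlers through the ASGI scope.
        scope.setdefault("state", {})["correlation_id"] = correlation_id

        async def send_with_header(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append((HEADER, correlation_id.encode()))
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_header)


async def demo_app(scope, receive, send):
    """Toy downstream app that returns the correlation ID it was given."""
    body = scope["state"]["correlation_id"].encode()
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": body})


def call(app, headers=()):
    """Drive one request through an ASGI app and collect the messages it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/", "headers": list(headers)}
    asyncio.run(app(scope, receive, send))
    return sent
```

Because the class takes the wrapped app as its only constructor argument, it should be usable with FastAPI's `app.add_middleware(CorrelationIdMiddleware)`, though that wiring is not exercised here.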

Non-Negotiable Engineering Rules

  1. Configurability first — model/API swaps via YAML, never code changes
  2. Feature flags everywhere — Flagsmith per-tenant targeting, kill switches
  3. Multi-tenancy from day 1 — tenant_id on every table, PostgreSQL RLS, Clerk org isolation
  4. Production-evolvable — clean extraction to microservices post-seed
  5. Neo4j graph from day 1 — clinical knowledge graph is core data asset
  6. GDPR from day 1 — consent-gated processing, audit logging, DSAR handler
  7. Structured error codes — domain-prefixed (AGENT_*, FHIR_*, MATCH_*)
  8. API versioning — /api/v1/ with semantic versioning
  9. Both Claude + OpenAI — tiered routing via model_registry.yaml
  10. Pluggable matching — strategy pattern with shadow mode
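
Rule 10 can be sketched as follows, assuming "shadow mode" means a candidate strategy runs alongside the live one and its output is only recorded for comparison. Every name here (`MatchStrategy`, `SpecialtyMatch`, `RatingMatch`, `MatchingEngine`) and the toy ranking logic are hypothetical, not the project's actual matching engine.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


class MatchStrategy(Protocol):
    name: str

    def rank(self, patient: dict, providers: list[dict]) -> list[str]: ...


class SpecialtyMatch:
    """Toy strategy: keep providers with the requested specialty, in input order."""
    name = "specialty_v1"

    def rank(self, patient, providers):
        return [p["id"] for p in providers if p["specialty"] == patient["specialty"]]


class RatingMatch:
    """Toy strategy: same filter, then order by rating, highest first."""
    name = "rating_v2"

    def rank(self, patient, providers):
        eligible = [p for p in providers if p["specialty"] == patient["specialty"]]
        ordered = sorted(eligible, key=lambda p: p["rating"], reverse=True)
        return [p["id"] for p in ordered]


@dataclass
class MatchingEngine:
    primary: MatchStrategy
    shadow: Optional[MatchStrategy] = None
    shadow_log: list[dict] = field(default_factory=list)

    def match(self, patient: dict, providers: list[dict]) -> list[str]:
        result = self.primary.rank(patient, providers)
        if self.shadow is not None:
            try:
                candidate = self.shadow.rank(patient, providers)
                self.shadow_log.append(
                    {"shadow": self.shadow.name, "agrees": candidate == result,
                     "candidate": candidate}
                )
            except Exception as exc:
                # A failing shadow strategy must never affect the live result.
                self.shadow_log.append({"shadow": self.shadow.name, "error": repr(exc)})
        return result  # callers only ever see the primary strategy's output
```

Promoting a shadow strategy is then a configuration change (swap `primary` and `shadow`), which fits the rule that model and strategy swaps should not require code changes.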