01 — System Architecture

Architectural Pattern: Modular Monolith

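One FastAPI process runs in one Railway container. Each domain keeps its own routers, services, schemas, and models, and domains do not import each other's internals (the two orchestration exceptions, Matching and Agents, are listed under Domain Boundaries). This isolation keeps a clean path to extracting microservices post-seed.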

Decision rationale: MVP velocity, not budget. Under AI-assisted development, microservices would double both the debugging surface area and the coordination overhead.

Technology Stack

Layer	Technology	Role	Cost/mo
Backend	FastAPI (Python 3.12+)	API gateway + service layer	$0 (on Railway)
Orchestration	LangGraph	Multi-agent workflow engine	$0 (open source)
Toolkit	LangChain	Tool wrappers for LLM/DB/API	$0 (open source)
Frontend	Vite + React + TypeScript	Patient/provider/admin UIs	$0 (on Vercel)
Primary DB	Railway PostgreSQL	FHIR, tenancy, events, audit	Included in Railway Pro
Graph DB	Neo4j Aura	Clinical knowledge graph	$0 (free, 200K nodes)
Vector DB	Qdrant Cloud	Medical embeddings	$0 (free, 1GB)
Cache	Upstash Redis	Application caching	$0 (free, 10K/day)
Event Bus	Upstash QStash	Async events + scheduling	$0 (free, 500/day)
Auth/IAM	Clerk	Auth, RBAC, multi-tenant orgs	$0 (free, 10K MAU)
Feature Flags	Flagsmith	Toggles + A/B assignment	$0 (free, 50K/mo)
LLM (reasoning)	Claude API	Agent reasoning backbone	~$30
LLM (routing)	OpenAI API	Cost-efficient task routing	~$10
LLM Tracing	Langfuse Cloud	Traces, cost, latency, prompts	$0 (free tier)
Embeddings	Voyage AI	Medical document embeddings	$0 (free tier)
File Storage	Cloudflare R2	Documents, X-rays, PDFs	$0 (free, 10GB)
OCR (primary)	PyMuPDF	Local PDF text extraction	$0 (bundled)
OCR (fallback)	Unstructured.io	Complex layout extraction	$0 (free, 1K pages)
Email	Resend	Transactional email	$0 (free, 3K/mo)
Hosting (API)	Railway Pro	FastAPI container	$20
Hosting (web)	Vercel	React SPA edge deployment	$0 (free tier)
Analytics	PostHog Cloud	Frontend user behavior	$0 (free, 1M events)
BI	Metabase OSS	Dashboards + reporting	$0 (on Railway)
Infra monitoring	Grafana Cloud	Container metrics	$0 (free tier)

Total: 25 services, 21 on free tiers. Estimated ~$60/mo (Claude API ~$30, OpenAI API ~$10, Railway Pro $20).

LLM Tiered Routing

Task Tier	Models	Cost/MTok	Usage
Simple (80% of calls)	GPT-4o mini / Claude Haiku 4.5	$0.15–$1.00 in, $0.60–$5.00 out	Classification, intake parsing, ICD extraction
Complex (20% of calls)	Claude Sonnet 4.6 / GPT-4.1	$2.00–$3.00 in, $8.00–$15.00 out	Clinical analysis, comorbidity reasoning, re-ranking

Per patient journey (6 agent calls): $0.07–$0.50
Target average: <$0.15 with tiered routing
Model selection: config/model_registry.yaml — swap models without code changes
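
A minimal sketch of how tiered routing driven by a registry might look. The registry layout, the task-type names, the model ID strings, and the token counts in the usage note are all assumptions made for illustration; only the per-MTok prices come from the table above. In practice the dictionary would be parsed from config/model_registry.yaml rather than written inline.

```python
from dataclasses import dataclass

# Hypothetical registry contents; the real values would be parsed from
# config/model_registry.yaml. Prices are USD per million tokens (table above).
# The model ID strings are placeholders, not verified API identifiers.
MODEL_REGISTRY = {
    "simple": {"model": "gpt-4o-mini", "usd_in_per_mtok": 0.15, "usd_out_per_mtok": 0.60},
    "complex": {"model": "claude-sonnet", "usd_in_per_mtok": 3.00, "usd_out_per_mtok": 15.00},
}

# Assumed mapping from task type to tier (task names are illustrative).
TASK_TIERS = {
    "classification": "simple",
    "intake_parsing": "simple",
    "icd_extraction": "simple",
    "clinical_analysis": "complex",
    "comorbidity_reasoning": "complex",
    "reranking": "complex",
}


@dataclass(frozen=True)
class ModelChoice:
    tier: str
    model: str
    usd_in_per_mtok: float
    usd_out_per_mtok: float


def route(task_type: str) -> ModelChoice:
    """Pick a model for a task; unknown task types fall back to the complex tier."""
    tier = TASK_TIERS.get(task_type, "complex")
    return ModelChoice(tier=tier, **MODEL_REGISTRY[tier])


def call_cost_usd(choice: ModelChoice, tokens_in: int, tokens_out: int) -> float:
    """Cost of one call: tokens times the per-million-token price, divided by 1e6."""
    weighted = tokens_in * choice.usd_in_per_mtok + tokens_out * choice.usd_out_per_mtok
    return weighted / 1_000_000


# Usage: a six-call journey with assumed token counts per call.
journey = [
    "intake_parsing", "classification", "icd_extraction",
    "classification", "clinical_analysis", "reranking",
]
journey_cost = sum(call_cost_usd(route(t), tokens_in=2_000, tokens_out=500) for t in journey)
```

Because the routing table and prices live in data rather than in code, swapping a model means editing the registry entry only. Real journey costs depend on actual prompt and completion sizes, which this sketch does not try to model.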

Domain Boundaries

Domain	Files	External Deps	Cross-Domain
Patients	`routers/patients`, `services/patient_service`, `models/patient`, `schemas/patient`	Encryption service	None
Providers	`routers/providers`, `services/provider_service`, `models/provider`	None	None
Consent	`routers/consent`, `services/consent_service`, `models/consent`	None	None
FHIR/Clinical	`routers/fhir`, `services/fhir_service`, `models/fhir_resource`	fhir.resources lib	None
Documents	`routers/documents`, `services/document_service`, `models/document`	Cloudflare R2	None
Matching	`routers/match`, `services/match_service`, `services/matching_engine`	None	Reads: patient, fhir, provider
Agents	`agents/clinical_context`, `agents/intake`, `agents/match`, `agents/explanation`	Claude API, Langfuse	Reads: patient, fhir, provider. Writes: fhir, events
GDPR	`services/data_subject_handler`	None	Cascades across all

Rule: Matching and Agents are orchestration points — they read from other domains but those domains never import from matching/agents.
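
A rule like this is easier to keep when a check runs in CI. The following is a sketch only, using the standard-library `ast` module; it assumes modules are importable under dotted names such as `services.patient_service` or `agents.intake`, which the file list above suggests but does not confirm. The function names are hypothetical.

```python
import ast

# Module prefixes belonging to the two orchestration domains (from the table above).
ORCHESTRATION_PREFIXES = (
    "routers.match",
    "services.match_service",
    "services.matching_engine",
    "agents",
)


def imported_names(source: str) -> list[str]:
    """Return fully qualified names of everything a source file imports.

    Relative imports (level > 0) are skipped in this sketch.
    """
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.extend(f"{node.module}.{alias.name}" for alias in node.names)
    return names


def is_orchestration(name: str) -> bool:
    """True if the dotted name is, or lives under, an orchestration module."""
    return any(name == p or name.startswith(p + ".") for p in ORCHESTRATION_PREFIXES)


def boundary_violations(module_name: str, source: str) -> list[str]:
    """List forbidden imports made by one module.

    Orchestration modules may read from other domains, so they never violate
    the rule; every other module must not import matching or agents code.
    """
    if is_orchestration(module_name):
        return []
    return [name for name in imported_names(source) if is_orchestration(name)]
```

A CI job could walk the source tree, call `boundary_violations` for each file, and fail the build if any list is non-empty.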

System Layers

┌─────────────────────────────────────────────────────────────┐
│  INPUT LAYER                                                 │
│  Vite/React (app.curaway.ai) · Document Upload (R2 presign) │
└───────────────────────────┬─────────────────────────────────┘
                            │ HTTPS
┌───────────────────────────▼─────────────────────────────────┐
│  GATEWAY LAYER                                               │
│  FastAPI (services.curaway.ai) · Clerk Auth                 │
│  Tenant Middleware · Correlation ID · CORS                  │
│  Rate Limiting (post-MVP)                                   │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────┐
│  INTELLIGENCE LAYER                                          │
│  LangGraph Agents (Intake, Clinical, Match, Explanation)     │
│  Orchestrator · Langfuse Prompts · Guardrails (YAML)         │
│  Tiered LLM Routing (Claude + OpenAI)                        │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────┐
│  DATA LAYER                                                  │
│  Railway PostgreSQL (FHIR JSONB + events + tenancy)          │
│  Neo4j Aura (knowledge graph) · Qdrant Cloud (embeddings)   │
│  Upstash Redis (cache)                                       │
└───────────────────────────┬─────────────────────────────────┘
                            │
┌───────────────────────────▼─────────────────────────────────┐
│  INTEGRATION LAYER                                           │
│  QStash (async bus + cron) · Resend (email) · R2 (files)    │
│  MCP Server (external AI) · Flagsmith (feature flags)        │
└─────────────────────────────────────────────────────────────┘
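
The gateway layer above lists a correlation ID step. As an illustration of what that step could do, here is a sketch of a framework-free ASGI middleware. The header name `X-Correlation-ID` and the `correlation_id` key in the scope are assumptions, not names taken from the project.

```python
import uuid

HEADER = b"x-correlation-id"  # assumed header name; ASGI header names are lowercase bytes


class CorrelationIdMiddleware:
    """Pure ASGI middleware: reuse the caller's correlation ID or mint a new one."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        incoming = dict(scope.get("headers", []))
        correlation_id = incoming.get(HEADER, b"").decode("latin-1") or uuid.uuid4().hex
        # Expose the ID to downstream handlers through the ASGI scope.
        scope.setdefault("state", {})["correlation_id"] = correlation_id

        async def send_with_header(message):
            if message["type"] == "http.response.start":
                # Echo the ID on the response so clients can quote it in reports.
                headers = list(message.get("headers", []))
                headers.append((HEADER, correlation_id.encode("latin-1")))
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_header)


# With FastAPI this class could be registered via app.add_middleware(CorrelationIdMiddleware).
```

Writing the middleware against the ASGI interface rather than a framework helper keeps it independent of the web framework, which fits the goal of extracting services later.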

Non-Negotiable Engineering Rules

Configurability first — model/API swaps via YAML, never code changes
Feature flags everywhere — Flagsmith per-tenant targeting, kill switches
Multi-tenancy from day 1 — tenant_id on every table, PostgreSQL RLS, Clerk org isolation
Production-evolvable — clean extraction to microservices post-seed
Neo4j graph from day 1 — clinical knowledge graph is core data asset
GDPR from day 1 — consent-gated processing, audit logging, DSAR handler
Structured error codes — domain-prefixed (AGENT_*, FHIR_*, MATCH_*)
API versioning — /api/v1/ with semantic versioning
Both Claude + OpenAI — tiered routing via model_registry.yaml
Pluggable matching — strategy pattern with shadow mode
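
To make the last rule concrete, the following sketch shows one way a strategy pattern with shadow mode could be wired. The strategy classes, the `match` function, and the patient and provider fields are hypothetical; the source does not describe the actual matching engine's interface.

```python
from typing import Callable, Optional, Protocol


class MatchingStrategy(Protocol):
    """Interface each matching strategy implements (hypothetical)."""

    name: str

    def rank(self, patient: dict, providers: list[dict]) -> list[str]:
        """Return provider IDs ordered best match first."""
        ...


class SpecialtyFirst:
    name = "specialty_first"

    def rank(self, patient: dict, providers: list[dict]) -> list[str]:
        # Providers in the needed specialty come first; ties keep input order.
        ordered = sorted(providers, key=lambda p: p["specialty"] != patient["needs"])
        return [p["id"] for p in ordered]


class RatingFirst:
    name = "rating_first"

    def rank(self, patient: dict, providers: list[dict]) -> list[str]:
        # Highest-rated providers come first, regardless of specialty.
        ordered = sorted(providers, key=lambda p: p["rating"], reverse=True)
        return [p["id"] for p in ordered]


def match(
    patient: dict,
    providers: list[dict],
    active: MatchingStrategy,
    shadow: Optional[MatchingStrategy] = None,
    log: Callable[[dict], None] = print,
) -> list[str]:
    """Serve the active strategy's ranking; run the shadow only for comparison."""
    served = active.rank(patient, providers)
    if shadow is not None:
        try:
            candidate = shadow.rank(patient, providers)
            log({"active": active.name, "shadow": shadow.name, "agree": candidate == served})
        except Exception as exc:
            # A failing shadow strategy must never affect the served result.
            log({"shadow": shadow.name, "error": repr(exc)})
    return served
```

The caller always receives the active strategy's ranking. The shadow result is only logged, so a candidate strategy can be compared against production traffic before it is promoted.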