01 — System Architecture
Architectural Pattern: Modular Monolith
One FastAPI process in one Railway container. Each domain has its own routers, services, schemas, and models, with no cross-domain internal imports apart from the orchestration domains described under Domain Boundaries. This keeps a later extraction into microservices (post-seed) clean.
Decision rationale: MVP velocity, not budget. Microservices double debugging surface area and coordination overhead with AI-assisted development.
Technology Stack
| Layer | Technology | Role | Cost/mo |
|---|---|---|---|
| Backend | FastAPI (Python 3.12+) | API gateway + service layer | $0 (on Railway) |
| Orchestration | LangGraph | Multi-agent workflow engine | $0 (open source) |
| Toolkit | LangChain | Tool wrappers for LLM/DB/API | $0 (open source) |
| Frontend | Vite + React + TypeScript | Patient/provider/admin UIs | $0 (on Vercel) |
| Primary DB | Railway PostgreSQL | FHIR, tenancy, events, audit | Included in Railway Pro |
| Graph DB | Neo4j Aura | Clinical knowledge graph | $0 (free, 200K nodes) |
| Vector DB | Qdrant Cloud | Medical embeddings | $0 (free, 1GB) |
| Cache | Upstash Redis | Application caching | $0 (free, 10K/day) |
| Event Bus | Upstash QStash | Async events + scheduling | $0 (free, 500/day) |
| Auth/IAM | Clerk | Auth, RBAC, multi-tenant orgs | $0 (free, 10K MAU) |
| Feature Flags | Flagsmith | Toggles + A/B assignment | $0 (free, 50K/mo) |
| LLM (reasoning) | Claude API | Agent reasoning backbone | ~$30 |
| LLM (routing) | OpenAI API | Cost-efficient task routing | ~$10 |
| LLM Tracing | Langfuse Cloud | Traces, cost, latency, prompts | $0 (free tier) |
| Embeddings | Voyage AI | Medical document embeddings | $0 (free tier) |
| File Storage | Cloudflare R2 | Documents, X-rays, PDFs | $0 (free, 10GB) |
| OCR (primary) | PyMuPDF | Local PDF text extraction | $0 (bundled) |
| OCR (fallback) | Unstructured.io | Complex layout extraction | $0 (free, 1K pages) |
| Email | Resend | Transactional email | $0 (free, 3K/mo) |
| Hosting (API) | Railway Pro | FastAPI container | $20 |
| Hosting (web) | Vercel | React SPA edge deployment | $0 (free tier) |
| Analytics | PostHog Cloud | Frontend user behavior | $0 (free, 1M events) |
| BI | Metabase OSS | Dashboards + reporting | $0 (on Railway) |
| Infra monitoring | Grafana Cloud | Container metrics | $0 (free tier) |
Total: 25 services, 21 on free tiers. ~$60/mo estimated.
LLM Tiered Routing
| Task Tier | Models | Cost/MTok | Usage |
|---|---|---|---|
| Simple (80% of calls) | GPT-4o mini / Claude Haiku 4.5 | $0.15–$1.00 in, $0.60–$5.00 out | Classification, intake parsing, ICD extraction |
| Complex (20% of calls) | Claude Sonnet 4.6 / GPT-4.1 | $2.00–$3.00 in, $8.00–$15.00 out | Clinical analysis, comorbidity reasoning, re-ranking |
- Per patient journey (6 agent calls): $0.07–$0.50
- Target average: <$0.15 with tiered routing
- Model selection: `config/model_registry.yaml` (swap models without code changes)
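The routing and cost figures above can be sketched as follows. This is a minimal illustration, not the project's actual router: the registry shape, the task-to-tier map, the fallback to the complex tier, and the token counts are all assumptions. Only the per-MTok prices are taken from the table (GPT-4o mini for the simple tier, Claude Sonnet for the complex tier).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    usd_per_mtok_in: float
    usd_per_mtok_out: float


# Stand-in for the parsed contents of config/model_registry.yaml.
# The structure and model names are hypothetical; prices come from the table.
REGISTRY = {
    "simple": ModelSpec("gpt-4o-mini", 0.15, 0.60),
    "complex": ModelSpec("claude-sonnet", 3.00, 15.00),
}

# Hypothetical mapping from task type to tier.
TASK_TIERS = {
    "classification": "simple",
    "intake_parsing": "simple",
    "icd_extraction": "simple",
    "clinical_analysis": "complex",
    "comorbidity_reasoning": "complex",
    "re_ranking": "complex",
}


def select_model(task: str) -> ModelSpec:
    """Pick a model by tier; unknown tasks fall back to the complex tier (assumption)."""
    return REGISTRY[TASK_TIERS.get(task, "complex")]


def call_cost(model: ModelSpec, tokens_in: int, tokens_out: int) -> float:
    """Cost in USD of one call, given per-million-token prices."""
    return (tokens_in * model.usd_per_mtok_in
            + tokens_out * model.usd_per_mtok_out) / 1_000_000


def journey_cost(calls) -> float:
    """Sum the cost of (task, tokens_in, tokens_out) tuples."""
    return sum(call_cost(select_model(t), tin, tout) for t, tin, tout in calls)


# A six-call journey with assumed token counts: four simple, two complex calls.
JOURNEY = [
    ("classification", 2000, 500),
    ("intake_parsing", 2000, 500),
    ("icd_extraction", 2000, 500),
    ("classification", 2000, 500),
    ("clinical_analysis", 5000, 1500),
    ("re_ranking", 5000, 1500),
]

print(f"${journey_cost(JOURNEY):.4f} per journey")
```

Under these assumed token counts the journey costs roughly $0.077, almost all of it from the two complex calls, which is why moving a call between tiers matters more than trimming tokens on simple calls.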
Domain Boundaries
| Domain | Files | External Deps | Cross-Domain |
|---|---|---|---|
| Patients | routers/patients, services/patient_service, models/patient, schemas/patient | Encryption service | None |
| Providers | routers/providers, services/provider_service, models/provider | None | None |
| Consent | routers/consent, services/consent_service, models/consent | None | None |
| FHIR/Clinical | routers/fhir, services/fhir_service, models/fhir_resource | fhir.resources lib | None |
| Documents | routers/documents, services/document_service, models/document | Cloudflare R2 | None |
| Matching | routers/match, services/match_service, services/matching_engine | None | Reads: patient, fhir, provider |
| Agents | agents/clinical_context, agents/intake, agents/match, agents/explanation | Claude API, Langfuse | Reads: patient, fhir, provider. Writes: fhir, events |
| GDPR | services/data_subject_handler | None | Cascades across all |
Rule: Matching and Agents are orchestration points — they read from other domains but those domains never import from matching/agents.
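One way to keep this rule from eroding is a small import check run in CI. The sketch below uses only the standard-library `ast` module. The module-to-domain map and the allowed-reads table are hypothetical, loosely derived from the file names and the Cross-Domain column above; a real check would walk the repository rather than take source strings.

```python
import ast

# Hypothetical map from importable module names to their owning domain.
MODULE_DOMAIN = {
    "services.patient_service": "patients",
    "models.patient": "patients",
    "services.provider_service": "providers",
    "models.provider": "providers",
    "services.fhir_service": "fhir",
    "models.fhir_resource": "fhir",
    "services.match_service": "matching",
    "services.matching_engine": "matching",
    "agents.intake": "agents",
}

# Domains that are allowed to read from other domains.
ALLOWED_READS = {
    "matching": {"patients", "fhir", "providers"},
    "agents": {"patients", "fhir", "providers"},
}


def imported_modules(source: str) -> set[str]:
    """Collect module names referenced by absolute imports in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module)
            # Also cover "from services import patient_service".
            found.update(f"{node.module}.{alias.name}" for alias in node.names)
    return found


def boundary_violations(domain: str, source: str) -> list[str]:
    """Return imports that reach into a domain this one may not read from."""
    allowed = ALLOWED_READS.get(domain, set()) | {domain}
    return sorted(
        module
        for module in imported_modules(source)
        # Unmapped modules (stdlib, third-party) count as in-domain.
        if MODULE_DOMAIN.get(module, domain) not in allowed
    )


print(boundary_violations("patients", "from services.matching_engine import rank"))
```

Matching code importing `services.patient_service` passes, while patient code importing `services.matching_engine` is reported, which mirrors the one-way dependency the rule describes.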
System Layers
┌─────────────────────────────────────────────────────────────┐
│ INPUT LAYER │
│ Vite/React (app.curaway.ai) · Document Upload (R2 presign) │
└───────────────────────────┬─────────────────────────────────┘
│ HTTPS
┌───────────────────────────▼─────────────────────────────────┐
│ GATEWAY LAYER │
│ FastAPI (services.curaway.ai) · Clerk Auth · Tenant Middleware │
│ Correlation ID · CORS · Rate Limiting (post-MVP) │
└───────────────────────────┬─────────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────────┐
│ INTELLIGENCE LAYER │
│ LangGraph Agents (Intake, Clinical, Match, Explanation) │
│ Orchestrator · Langfuse Prompts · Guardrails (YAML) │
│ Tiered LLM Routing (Claude + OpenAI) │
└───────────────────────────┬─────────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────────┐
│ DATA LAYER │
│ Railway PostgreSQL (FHIR JSONB + events + tenancy) │
│ Neo4j Aura (knowledge graph) · Qdrant Cloud (embeddings) │
│ Upstash Redis (cache) │
└───────────────────────────┬─────────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────────┐
│ INTEGRATION LAYER │
│ QStash (async bus + cron) · Resend (email) · R2 (files) │
│ MCP Server (external AI) · Flagsmith (feature flags) │
└─────────────────────────────────────────────────────────────┘
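As an illustration of the gateway layer's correlation ID step, here is a minimal pure-ASGI middleware sketch with no framework imports. The header name `x-correlation-id`, the class name, and the context variable are assumptions made for this example; the source does not specify how the ID is carried.

```python
import contextvars
import uuid

# Request-scoped correlation ID, readable from service or agent code.
correlation_id = contextvars.ContextVar("correlation_id", default="-")

HEADER = b"x-correlation-id"  # assumed header name


class CorrelationIdMiddleware:
    """Reuse the caller's correlation ID if present, otherwise mint one."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        headers = dict(scope.get("headers", []))
        cid = headers.get(HEADER, b"").decode() or uuid.uuid4().hex
        token = correlation_id.set(cid)

        async def send_with_header(message):
            # Echo the ID on the response so clients can quote it in reports.
            if message["type"] == "http.response.start":
                message = {
                    **message,
                    "headers": [*message.get("headers", []), (HEADER, cid.encode())],
                }
            await send(message)

        try:
            await self.app(scope, receive, send_with_header)
        finally:
            correlation_id.reset(token)
```

Because it follows the plain ASGI callable interface, a class like this can be registered with FastAPI via `app.add_middleware(CorrelationIdMiddleware)`, and downstream code can call `correlation_id.get()` when logging or tracing.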
Non-Negotiable Engineering Rules
- Configurability first — model/API swaps via YAML, never code changes
- Feature flags everywhere — Flagsmith per-tenant targeting, kill switches
- Multi-tenancy from day 1 — `tenant_id` on every table, PostgreSQL RLS, Clerk org isolation
- Production-evolvable — clean extraction to microservices post-seed
- Neo4j graph from day 1 — clinical knowledge graph is core data asset
- GDPR from day 1 — consent-gated processing, audit logging, DSAR handler
- Structured error codes — domain-prefixed (`AGENT_*`, `FHIR_*`, `MATCH_*`)
- API versioning — `/api/v1/` with semantic versioning
- Both Claude + OpenAI — tiered routing via `model_registry.yaml`
- Pluggable matching — strategy pattern with shadow mode
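The last rule, a strategy pattern with shadow mode, could look roughly like the sketch below. The strategy classes, the patient and provider dictionary fields, and the `divergences` list are hypothetical names chosen for illustration. The point is that the shadow strategy runs on the same input but can never change or break the response.

```python
import logging
from typing import Optional, Protocol

logger = logging.getLogger("matching")


class MatchingStrategy(Protocol):
    def rank(self, patient: dict, providers: list[dict]) -> list[str]:
        """Return provider IDs, best match first."""
        ...


class SpecialtyMatch:
    """Hypothetical baseline: keep providers whose specialty fits the need."""

    def rank(self, patient, providers):
        return [p["id"] for p in providers if p["specialty"] == patient["needs"]]


class RatingMatch:
    """Hypothetical candidate: same filter, then order by rating."""

    def rank(self, patient, providers):
        hits = [p for p in providers if p["specialty"] == patient["needs"]]
        return [p["id"] for p in sorted(hits, key=lambda p: p["rating"], reverse=True)]


class MatchingEngine:
    def __init__(self, primary: MatchingStrategy,
                 shadow: Optional[MatchingStrategy] = None):
        self.primary = primary
        self.shadow = shadow
        self.divergences = []  # (primary_result, shadow_result) pairs

    def match(self, patient, providers):
        result = self.primary.rank(patient, providers)
        if self.shadow is not None:
            try:
                shadow_result = self.shadow.rank(patient, providers)
                if shadow_result != result:
                    self.divergences.append((result, shadow_result))
            except Exception:
                # A failing shadow strategy must never affect the response.
                logger.exception("shadow strategy failed")
        return result


providers = [
    {"id": "p1", "specialty": "ortho", "rating": 4.1},
    {"id": "p2", "specialty": "ortho", "rating": 4.8},
    {"id": "p3", "specialty": "cardio", "rating": 4.9},
]
engine = MatchingEngine(SpecialtyMatch(), shadow=RatingMatch())
print(engine.match({"needs": "ortho"}, providers))
```

Callers always receive the primary strategy's ranking. Recorded divergences give a basis for deciding whether to promote the shadow strategy, for example by switching the primary behind a feature flag.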