Infrastructure¶
Curaway runs on a lean, pay-as-you-grow infrastructure stack optimized for a pre-revenue medical travel platform. Every service is either free-tier or minimal-cost, with clear upgrade paths when traffic demands it.
Architecture Overview¶
graph TB
subgraph "Client Layer"
FE[Next.js Frontend<br/>Vercel]
end
subgraph "Compute"
BE[FastAPI Backend<br/>Railway Pro]
end
subgraph "Data Stores"
PG[(PostgreSQL<br/>Railway Internal)]
N4J[(Neo4j Aura<br/>Free Tier)]
QD[(Qdrant Cloud<br/>Free Tier)]
RD[(Upstash Redis<br/>Free Tier)]
end
subgraph "Object Storage"
R2[Cloudflare R2<br/>10GB Free]
end
subgraph "Async / Cron"
QS[Upstash QStash<br/>Free Tier]
end
subgraph "External Services"
CL[Clerk Auth]
RS[Resend Email]
LF[Langfuse]
PH[PostHog]
FS[Flagsmith]
end
FE -->|HTTPS| BE
BE --> PG
BE --> N4J
BE --> QD
BE --> RD
BE --> R2
QS -->|Webhooks| BE
BE --> CL
BE --> RS
BE --> LF
BE --> PH
BE --> FS
Compute: Railway Pro¶
| Setting | Value |
|---|---|
| Plan | Pro ($20/month) |
| Runtime | Docker container (Python 3.11, FastAPI) |
| Deploy trigger | Push to main branch |
| Region | US West |
| Health check | GET /ready |
| Restart policy | Auto-restart on crash |
| Sleep | Never (Pro plan) |
Deployment Pipeline¶
sequenceDiagram
participant Dev as Developer
participant GH as GitHub
participant RW as Railway
participant HC as Health Check
Dev->>GH: Push to main
GH->>RW: Webhook trigger
RW->>RW: Build Docker image
RW->>RW: Start new container
RW->>HC: GET /ready
HC-->>RW: 200 OK
RW->>RW: Route traffic to new container
RW->>RW: Stop old container
Health Check Endpoint¶
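from fastapi import APIRouter
from fastapi.responses import JSONResponse

router = APIRouter()

@router.get("/ready")
async def readiness_check():
    """Railway health check endpoint."""
    # check_postgres() and check_redis() are defined elsewhere and are assumed to return booleans
    checks = {
        "database": await check_postgres(),
        "redis": await check_redis(),
    }
    all_healthy = all(checks.values())
    return JSONResponse(
        status_code=200 if all_healthy else 503,
        content={"status": "ready" if all_healthy else "degraded", "checks": checks},
    )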
Railway waits for /ready to return 200 before routing traffic to the new container. If the new container
never passes its health check, the deployment is marked as failed and the previous container keeps serving traffic.
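Because Railway gates traffic on this endpoint, a dependency probe that hangs could stall a deploy. The following is a minimal sketch of one way to bound each probe with a timeout, assuming every probe is an async callable that returns a boolean. The names `run_check` and `gather_checks` and the 2-second default are illustrative and are not taken from the Curaway codebase.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Probe = Callable[[], Awaitable[bool]]


async def run_check(probe: Probe, timeout: float = 2.0) -> bool:
    """Run one probe; a timeout or any exception counts as unhealthy."""
    try:
        return bool(await asyncio.wait_for(probe(), timeout=timeout))
    except Exception:
        return False


async def gather_checks(probes: Dict[str, Probe], timeout: float = 2.0) -> Dict[str, bool]:
    """Run all probes concurrently and map each name to its result."""
    names = list(probes)
    results = await asyncio.gather(*(run_check(probes[n], timeout) for n in names))
    return dict(zip(names, results))
```

With helpers like these, the endpoint could build `checks` from `await gather_checks({"database": check_postgres, "redis": check_redis})`, so a slow dependency reports as degraded instead of blocking the response.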
Dockerfile Essentials¶
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Frontend: Vercel¶
| Setting | Value |
|---|---|
| Plan | Free (Hobby) |
| Framework | Next.js 14 (App Router) |
| Deploy trigger | Push to main branch |
| Build command | next build |
| Output | Static + Server Components |
| Region | Auto (Edge) |
Vercel provides automatic preview deployments on pull requests and production deployments on merge to main. The frontend communicates with the Railway backend exclusively over HTTPS.
PostgreSQL (Railway Internal)¶
| Setting | Value |
|---|---|
| Host | postgres.railway.internal |
| Port | 5432 |
| Access | Railway internal network only |
| Connection | Via DATABASE_URL env var |
| Backups | Railway automatic daily |
PostgreSQL is the primary relational store for all transactional data: patients, providers, procedures, consents, documents, conversations, and audit logs.
Connection Configuration¶
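import os
from sqlalchemy.ext.asyncio import create_async_engine

# Railway provides DATABASE_URL automatically for internal PostgreSQL
# Example: postgresql://user:pass@postgres.railway.internal:5432/railway
DATABASE_URL = os.environ["DATABASE_URL"]
# create_async_engine needs an async driver in the URL scheme; asyncpg is assumed here
ASYNC_DATABASE_URL = DATABASE_URL.replace("postgresql://", "postgresql+asyncpg://", 1)
engine = create_async_engine(
    ASYNC_DATABASE_URL,
    pool_size=10,        # persistent connections kept in the pool
    max_overflow=5,      # extra connections allowed under burst load
    pool_pre_ping=True,  # test each connection before use to drop stale ones
)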
Internal Networking
Using postgres.railway.internal keeps database traffic within Railway's private
network. The database is not exposed to the public internet.
Neo4j Aura (Free Tier)¶
| Setting | Value |
|---|---|
| Instance | 21eea64f.databases.neo4j.io |
| Plan | Free (AuraDB) |
| Limits | 200,000 nodes, 400,000 relationships |
| Protocol | Bolt (neo4j+s://) |
| Use case | Provider-procedure-specialty graph |
Neo4j stores the relationship graph between providers, procedures, doctors, specialties, and geographic regions. It powers the semantic matching engine that connects patients to the right providers.
Current Graph Statistics¶
| Node Type | Count | Key Properties |
|---|---|---|
| Provider | 42 | name, country, city, tier |
| Procedure | 12 | name, category, parent |
| Doctor | 8 | name, specialty, verified |
| Specialty | ~20 | name, category |
| Country | 8 | name, iso_code |
Connection¶
from neo4j import AsyncGraphDatabase
driver = AsyncGraphDatabase.driver(
"neo4j+s://21eea64f.databases.neo4j.io",
auth=(NEO4J_USER, NEO4J_PASSWORD),
)
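To make the matching role of the graph concrete, here is a hedged sketch of how a provider lookup for a procedure might be expressed in Cypher. The node labels and properties (`Provider`, `Procedure`, `Country`, `tier`, `iso_code`) come from the statistics table; the relationship types `OFFERS` and `LOCATED_IN` and the function name are hypothetical placeholders, since the actual schema is not shown here.

```python
from typing import Any, Dict, Optional, Tuple


def build_provider_match_query(
    procedure: str,
    iso_code: Optional[str] = None,
    limit: int = 10,
) -> Tuple[str, Dict[str, Any]]:
    """Build a parameterized Cypher query and its parameter map."""
    # OFFERS and LOCATED_IN are assumed relationship types, not confirmed ones.
    query = "MATCH (p:Provider)-[:OFFERS]->(proc:Procedure {name: $procedure})\n"
    params: Dict[str, Any] = {"procedure": procedure, "limit": limit}
    if iso_code is not None:
        # Optional country filter, added only when a country is requested.
        query += "MATCH (p)-[:LOCATED_IN]->(:Country {iso_code: $iso_code})\n"
        params["iso_code"] = iso_code
    query += "RETURN p.name AS name, p.city AS city, p.tier AS tier\n"
    query += "ORDER BY p.tier\n"
    query += "LIMIT $limit"
    return query, params
```

The returned pair can be run inside a session, for example `result = await session.run(query, params)` within `async with driver.session() as session:`. Passing values as parameters instead of formatting them into the string avoids Cypher injection.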
Qdrant Cloud (Free Tier)¶
| Setting | Value |
|---|---|
| Plan | Free (1GB storage) |
| Region | Europe West 3 |
| Embedding model | Voyage AI voyage-3.5-lite |
| Dimensions | 1024 |
| Distance metric | Cosine |
Qdrant stores vector embeddings for semantic search across providers, medical conditions, and travel requirements.
Collections¶
| Collection | Vectors | Purpose |
|---|---|---|
| providers | 42 | Provider profile embeddings |
| conditions | 12 | Medical condition/procedure embeddings |
| requirement_embeddings | 70 | Travel requirement semantic search |
Connection¶
from qdrant_client import QdrantClient
qdrant = QdrantClient(
url=QDRANT_URL, # https://xxx.europe-west3-0.gcp.cloud.qdrant.io
api_key=QDRANT_API_KEY,
)
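With the Cosine metric, search results are ranked by the angle between embedding vectors, not by their magnitude. The sketch below shows the score on toy two-dimensional vectors purely for illustration; the real collections use 1024-dimensional embeddings, and Qdrant computes the score server-side.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score close to 1 regardless of length, orthogonal vectors score 0, and opposite vectors score close to -1.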
Upstash Redis (Free Tier)¶
| Setting | Value |
|---|---|
| Plan | Free (10,000 commands/day) |
| Protocol | HTTPS (REST API) |
| Use cases | Message queueing, response caching |
Two Usage Patterns¶
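1. Message Queue (RPUSH/LPOP)
Used for real-time message passing between services. RPUSH appends to the tail of a list and LPOP removes from its head, so messages are consumed in FIFO order: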
# Producer: Push message to queue
await redis.rpush(f"queue:{session_id}", json.dumps(message))
# Consumer: Pop message from queue
raw = await redis.lpop(f"queue:{session_id}")
2. Cache (GET/SET with TTL)
Used to cache expensive computations and avoid redundant API calls:
# Cache with 5-minute TTL
await redis.set(f"cache:provider:{provider_id}", json.dumps(data), ex=300)
# Read from cache
cached = await redis.get(f"cache:provider:{provider_id}")
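The two calls above are usually combined into a cache-aside helper: read the key, and only on a miss compute the value and store it with a TTL. This is a minimal sketch under that assumption; `get_or_compute` is a hypothetical name, and `FakeRedis` is an in-memory stand-in (it ignores expiry) so the example runs without a Redis instance.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, Dict, Optional


async def get_or_compute(
    redis: Any,
    key: str,
    ttl_seconds: int,
    compute: Callable[[], Awaitable[Any]],
) -> Any:
    """Return the cached JSON value for key, computing and storing it on a miss."""
    cached = await redis.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: skip the expensive call
    value = await compute()  # cache miss: do the work once
    await redis.set(key, json.dumps(value), ex=ttl_seconds)
    return value


class FakeRedis:
    """In-memory stand-in exposing the same async get/set shape; no expiry."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}

    async def get(self, key: str) -> Optional[str]:
        return self.store.get(key)

    async def set(self, key: str, value: str, ex: int = 0) -> None:
        self.store[key] = value
```

A call such as `await get_or_compute(redis, f"cache:provider:{provider_id}", 300, load_provider)` would then hit the backing store at most once per TTL window, where `load_provider` is whatever coroutine fetches the uncached data.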
Four Hot Paths¶
| Hot Path | Pattern | TTL | Purpose |
|---|---|---|---|
| Flagsmith flags | GET/SET | 300s | Avoid per-request Flagsmith API calls |
| Provider search results | GET/SET | 600s | Cache semantic search results |
| Conversation context | GET/SET | 1800s | Avoid re-fetching full history |
| Real-time message queue | RPUSH/LPOP | N/A | Streaming response delivery |
Upstash QStash (Free Tier)¶
| Setting | Value |
|---|---|
| Plan | Free (500 messages/day) |
| Protocol | HTTPS webhooks |
| Use case | Scheduled cron tasks |
QStash delivers HTTP requests to the Railway backend on a schedule. Each task is a POST to a specific endpoint.
Six Cron Tasks¶
| Task | Schedule | Endpoint | Purpose |
|---|---|---|---|
| Exchange rate refresh | Daily 00:00 UTC | POST /cron/exchange-rates | Fetch latest rates from Frankfurter API |
| Intake reminders | Hourly | POST /cron/intake-reminders | Nudge patients with incomplete intakes |
| Stale session cleanup | Daily 03:00 UTC | POST /cron/cleanup-sessions | Archive sessions idle >30 days |
| Consent expiry check | Daily 09:00 UTC | POST /cron/consent-expiry | Notify patients with expiring consents |
| Notification digest | Daily 08:00 UTC | POST /cron/notification-digest | Batch and send daily email digests |
| Analytics refresh | Every 2 hours | POST /cron/analytics-refresh | Recompute dashboard metrics |
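As a rough check against the free-tier limit, the schedules in the table can be converted to messages per day. This sketch assumes each scheduled run consumes exactly one QStash message and ignores retries, which would add to the count.

```python
from typing import Dict

# Runs per day implied by each schedule in the table.
RUNS_PER_DAY: Dict[str, int] = {
    "exchange-rates": 1,       # daily
    "intake-reminders": 24,    # hourly
    "cleanup-sessions": 1,     # daily
    "consent-expiry": 1,       # daily
    "notification-digest": 1,  # daily
    "analytics-refresh": 12,   # every 2 hours
}
FREE_TIER_MESSAGES_PER_DAY = 500  # limit quoted in the plan table


def daily_messages(runs: Dict[str, int]) -> int:
    """Total scheduled deliveries per day, one message per run."""
    return sum(runs.values())


used = daily_messages(RUNS_PER_DAY)
headroom = FREE_TIER_MESSAGES_PER_DAY - used
print(f"{used} messages/day used, {headroom} remaining")
```

Under these assumptions the six tasks use 40 of the 500 daily messages, leaving 460 for retries or additional schedules.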
QStash Verification¶
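from fastapi import APIRouter, HTTPException, Request
from upstash_qstash import Receiver

router = APIRouter()
receiver = Receiver(
    current_signing_key=QSTASH_CURRENT_SIGNING_KEY,
    next_signing_key=QSTASH_NEXT_SIGNING_KEY,
)

@router.post("/cron/{task_name}")
async def handle_cron(task_name: str, request: Request):
    body = await request.body()
    signature = request.headers.get("upstash-signature")
    if signature is None:
        # Reject unsigned requests before attempting verification
        raise HTTPException(status_code=401, detail="Missing Upstash-Signature header")
    receiver.verify(body=body, signature=signature, url=str(request.url))
    # Process the cron task...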
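Since the route accepts any `task_name`, the processing step needs a way to map the path segment to a handler and reject unknown names. One possible shape is a small registry, sketched below. The decorator, the registry, and the placeholder handler body are all hypothetical; only the task name `exchange-rates` comes from the cron table.

```python
import asyncio
from typing import Awaitable, Callable, Dict

CronHandler = Callable[[], Awaitable[str]]
CRON_HANDLERS: Dict[str, CronHandler] = {}


def cron_task(name: str) -> Callable[[CronHandler], CronHandler]:
    """Register a coroutine function under a /cron/{task_name} path segment."""
    def decorator(func: CronHandler) -> CronHandler:
        CRON_HANDLERS[name] = func
        return func
    return decorator


@cron_task("exchange-rates")
async def refresh_exchange_rates() -> str:
    return "exchange rates refreshed"  # placeholder for the real work


async def dispatch(task_name: str) -> str:
    """Run the handler registered for task_name, or raise KeyError."""
    handler = CRON_HANDLERS.get(task_name)
    if handler is None:
        # The route would translate this into a 404 response.
        raise KeyError(f"unknown cron task: {task_name}")
    return await handler()
```

Inside `handle_cron`, the call would be `await dispatch(task_name)` after signature verification succeeds, so unverified requests never reach a handler.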
Cloudflare R2 (Object Storage)¶
| Setting | Value |
|---|---|
| Plan | Free (10GB storage, zero egress) |
| Protocol | S3-compatible API |
| Use case | Patient document uploads |
| Upload method | Presigned URLs |
Why R2 Over S3?¶
Zero egress fees. When patients download their documents (medical records, visa copies, insurance paperwork), there is no per-GB charge. For a medical travel platform where documents are uploaded once and downloaded many times, this significantly reduces costs.
Presigned URL Flow¶
sequenceDiagram
participant FE as Frontend
participant BE as Backend
participant R2 as Cloudflare R2
FE->>BE: POST /documents/presign {filename, size, type}
BE->>BE: Validate metadata (ext, MIME, size)
BE->>R2: Generate presigned URL
R2-->>BE: Presigned PUT URL (15min expiry)
BE-->>FE: {upload_url, document_key}
FE->>R2: PUT file directly to R2
R2-->>FE: 200 OK
FE->>BE: POST /documents/confirm {document_key}
BE->>BE: Record document in PostgreSQL
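The "Validate metadata (ext, MIME, size)" step in the flow can be sketched as a pure function that runs before any presigned URL is issued. The allowed types and the 20 MB cap below are illustrative assumptions, not the values used by the Curaway backend, and `validate_upload` is a hypothetical name.

```python
import os
from typing import Optional

# Hypothetical policy: extension -> expected MIME type.
ALLOWED_TYPES = {
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumed cap of 20 MB


def validate_upload(filename: str, size: int, content_type: str) -> Optional[str]:
    """Return None when the metadata is acceptable, else a reason string."""
    ext = os.path.splitext(filename)[1].lower()
    expected = ALLOWED_TYPES.get(ext)
    if expected is None:
        return f"extension {ext or '(none)'} is not allowed"
    if content_type.lower() != expected:
        return f"MIME type {content_type} does not match {ext}"
    if size <= 0 or size > MAX_SIZE_BYTES:
        return "size must be between 1 byte and 20 MB"
    return None
```

These checks only cover what the client declares. Because the file goes straight to R2, the confirm step is the natural place to re-check the stored object's actual size and type.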
Cloudflare DNS¶
| Setting | Value |
|---|---|
| Mode | DNS-only (grey cloud) |
| Reason | Proxy mode returns HTML challenges |
Grey Cloud Required
Cloudflare proxy mode (orange cloud) intercepts API requests and may return HTML challenge pages instead of JSON responses. This breaks the mobile and web clients. DNS-only mode passes traffic directly to Railway/Vercel without interference.
Email: Resend¶
| Setting | Value |
|---|---|
| Plan | Free (3,000 emails/month) |
| From domain | Configured via Cloudflare DNS |
| Templates | match_ready, intake_reminder, consent_expiring |
GitHub Repositories¶
| Repository | Purpose | Deploy Target |
|---|---|---|
| whoinsane/curaway | Backend (FastAPI) | Railway |
| whoinsane/curaway-health-navigator | Frontend (Next.js) | Vercel |
Both repositories deploy automatically on push to main.
CORS Configuration¶
from fastapi.middleware.cors import CORSMiddleware

ALLOWED_ORIGINS = [
"http://localhost:3000", # Local frontend dev
"https://curaway.vercel.app", # Production frontend
"https://curaway-health-navigator.vercel.app", # Alternative frontend URL
]
app.add_middleware(
CORSMiddleware,
allow_origins=ALLOWED_ORIGINS,
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
Environment Variables Reference¶
The backend requires the following environment variables. Values are set in Railway's dashboard.
| Variable | Service | Purpose |
|---|---|---|
| DATABASE_URL | PostgreSQL | Primary database connection |
| NEO4J_URI | Neo4j Aura | Graph database URI |
| NEO4J_USER | Neo4j Aura | Graph database username |
| NEO4J_PASSWORD | Neo4j Aura | Graph database password |
| QDRANT_URL | Qdrant Cloud | Vector database URL |
| QDRANT_API_KEY | Qdrant Cloud | Vector database key |
| UPSTASH_REDIS_URL | Upstash Redis | Redis REST URL |
| UPSTASH_REDIS_TOKEN | Upstash Redis | Redis auth token |
| QSTASH_CURRENT_SIGNING_KEY | Upstash QStash | Webhook verification |
| QSTASH_NEXT_SIGNING_KEY | Upstash QStash | Webhook verification (rotation) |
| CLERK_SECRET_KEY | Clerk | Auth verification |
| CLERK_PUBLISHABLE_KEY | Clerk | Frontend auth |
| R2_ACCOUNT_ID | Cloudflare R2 | R2 account identifier |
| R2_ACCESS_KEY_ID | Cloudflare R2 | S3-compatible access key |
| R2_SECRET_ACCESS_KEY | Cloudflare R2 | S3-compatible secret key |
| R2_BUCKET | Cloudflare R2 | Bucket name |
| RESEND_API_KEY | Resend | Email sending |
| ANTHROPIC_API_KEY | Anthropic | Claude LLM access |
| VOYAGE_API_KEY | Voyage AI | Embedding model access |
| LANGFUSE_PUBLIC_KEY | Langfuse | Tracing public key |
| LANGFUSE_SECRET_KEY | Langfuse | Tracing secret key |
| LANGFUSE_HOST | Langfuse | Tracing host URL |
| POSTHOG_API_KEY | PostHog | Analytics key |
| FLAGSMITH_API_KEY | Flagsmith | Feature flags |
| FRANKFURTER_API_URL | Frankfurter | Exchange rate API base URL |
| ENCRYPTION_KEY | Internal | Fernet key for field-level encryption |
Secret Management
Never commit environment variable values to source control. All secrets are managed through Railway's encrypted environment variable storage and injected at runtime.
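With this many variables injected at runtime, a missing value is easier to diagnose at startup than on first use. The following sketch shows one way to fail fast. The function names are hypothetical, and `REQUIRED_VARS` lists only a subset of the reference table for brevity.

```python
import os
from typing import Iterable, List, Mapping

# Illustrative subset of the reference table; extend as needed.
REQUIRED_VARS = (
    "DATABASE_URL",
    "NEO4J_URI",
    "QDRANT_URL",
    "UPSTASH_REDIS_URL",
    "CLERK_SECRET_KEY",
)


def missing_vars(env: Mapping[str, str], required: Iterable[str] = REQUIRED_VARS) -> List[str]:
    """Return required names that are unset or blank, in declaration order."""
    return [name for name in required if not env.get(name, "").strip()]


def assert_env_ready(env: Mapping[str, str] = os.environ) -> None:
    """Raise RuntimeError naming every missing variable (never their values)."""
    missing = missing_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `assert_env_ready()` before the app starts serving would surface a misconfigured deploy as a failed start. The error message lists names only, so no secret values end up in logs.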
Cost Summary¶
| Service | Monthly Cost | Free Tier Limit |
|---|---|---|
| Railway Pro | $20 | N/A (paid plan) |
| Vercel | $0 | 100GB bandwidth |
| Neo4j Aura | $0 | 200K nodes |
| Qdrant Cloud | $0 | 1GB storage |
| Upstash Redis | $0 | 10K commands/day |
| Upstash QStash | $0 | 500 messages/day |
| Cloudflare R2 | $0 | 10GB storage |
| Resend | $0 | 3K emails/month |
| Clerk | $0 | 10K MAU |
| Total | ~$20/mo | |