Data Flow Map¶
Every PHI-bearing surface in Curaway, with where data lives today, what gates it, and what changes on GCP.
Audience: Engineering team + compliance review (BAA / HIPAA scoping for the GCP target environment). Read alongside the sequence diagrams, which show how data moves between these surfaces over time.
1. PHI residency — Postgres tables¶
Curaway uses PostgreSQL Row-Level Security (RLS) as defense-in-depth on top of application-level tenant scoping (the DAO layer in app/repositories/). The RLS policy tenant_isolation on each table compares each row's tenant_id to current_setting('app.tenant_id'), which the API sets per request.
RLS was enabled in Alembic migration 785ab8660c94.
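The per-request scoping described above can be sketched as follows. This is a minimal illustration, not the contents of migration 785ab8660c94: the helper names `tenant_isolation_ddl` and `set_tenant_statement` are hypothetical, `tenant_id` is assumed to be a text column, and the `%s` placeholder assumes a psycopg-style driver. It covers only tables with a direct `tenant_id` column. `set_config(name, value, true)` is the standard PostgreSQL function for writing a transaction-local setting that `current_setting` then reads.

```python
import re

# Lowercase SQL identifiers only, so a table name can be interpolated safely.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def tenant_isolation_ddl(table: str) -> list[str]:
    """Return DDL that enables RLS and adds a tenant_isolation policy.

    Assumption: the table has a direct, text-typed tenant_id column.
    """
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        f"CREATE POLICY tenant_isolation ON {table} "
        "USING (tenant_id = current_setting('app.tenant_id'))",
    ]


def set_tenant_statement(tenant_id: str) -> tuple[str, tuple[str]]:
    """Return a parameterized statement that scopes the current transaction.

    The third argument (true) makes the setting transaction-local, so the
    value is discarded at commit or rollback instead of lingering on a
    pooled connection.
    """
    if not tenant_id:
        raise ValueError("tenant_id must be non-empty")
    return "SELECT set_config('app.tenant_id', %s, true)", (tenant_id,)
```

A request handler would execute the statement from `set_tenant_statement` at the start of each transaction, before any query against an RLS-protected table. Tables isolated through a join rather than a direct column would need a different policy expression.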
Tables with RLS (direct tenant_id) — 13 tables hold PHI¶
| Table | PHI content | Notes |
|---|---|---|
| patients | Name, contact info, demographics | Core identity table |
| cases | Layer state (medical / financial / family-support / etc.), procedure intent | The largest PHI carrier |
| fhir_resources | FHIR R4 JSON: Conditions, Procedures, Observations, MedicationRequests | One row per resource per case |
| document_references | Filename, content type, R2 storage_key, OCR text snippet | Full document body lives in R2 |
| consent_records | Consent text hash, IP address, signed-at | Append-only |
| conversations | Conversation pointer per case | Joins to messages |
| messages | Patient + agent message bodies | RLS via conversation_id join |
| match_results | Match score breakdown per (case, provider, doctor) | Includes scoring inputs |
| notifications | Email/push body for individual users | |
| consultations | Booked teleconsultation slots | |
| feedback_records | CSAT ratings, free-text feedback | |
| device_registrations | Push notification tokens per patient | |
| data_forwarding_audits | Provenance log of every cross-tenant case forward | Append-only |
System tables — no RLS (cross-tenant catalogs and ops)¶
| Table | What it holds | PHI? |
|---|---|---|
| audit_logs | RBAC + admin actions (actor_id only, no PHI body) | No |
| events | Internal event bus (LLM calls, agent steps); payload may carry case_id but never raw clinical text | Indirect (case_id) |
| providers, provider_procedures, provider_facilities | Provider catalog | No |
| doctors, doctor_procedures | Doctor catalog | No |
| treatment_categories, procedure_requirements | Clinical taxonomy | No |
| consent_purposes, notification_templates, notification_preferences | Reference data | No |
| idempotency_keys | Request dedup | No |
| legal_agreements, user_agreement_acceptances | Terms acceptance log | Indirect (user_id) |
| tenants, tenant_settings | Tenant catalog | No |
| roles, user_roles, tenant_org_mappings | RBAC + Clerk-org-to-tenant map | Indirect (user_id) |
Migration target¶
| Today | GCP target | Compliance impact |
|---|---|---|
| Railway PostgreSQL (managed, single-region us-east) | Cloud SQL Postgres with regional HA + automatic backups | Cloud SQL is HIPAA-eligible with BAA. Region selection matters for cross-border data residency (US → EU patient routes are in-scope for GDPR). |
| RLS policies | Identical SQL — no migration changes | Carry forward as-is |
| tenant_isolation policy expression | current_setting('app.tenant_id') | Same in Cloud SQL Postgres |
| curaway_app non-superuser role | Re-create on Cloud SQL with same grants | Move credentials to Secret Manager |
2. PHI residency — non-Postgres stores¶
| Store | Today | What it holds | GCP target |
|---|---|---|---|
| Cloudflare R2 | Cloudflare (us-east) | Document bytes (PDFs, images) — uploaded directly via presigned URL, never traverses Curaway API | GCS (bucket per tenant or single bucket with object-prefix per tenant + IAM conditions) |
| Neo4j AuraDB | Neo4j managed (Cloud-hosted) | Clinical knowledge graph: Patient → Condition → Procedure → Provider → Outcome → Cost → Location nodes. Patient nodes carry patient_id (UUID, not PII), but edges to Conditions/Procedures are PHI | Decision: keep AuraDB with BAA, or self-host on GKE. Edge latency matters: the Aura instance should run in the same region as the GCP target VPC. |
| Qdrant Cloud | Qdrant managed (Cloud-hosted) | Vector embeddings of provider profiles + ICD-10 condition descriptions. No patient embeddings stored; query-time only. | Either keep Qdrant Cloud (with BAA) or migrate to Vertex AI Vector Search. Migration requires re-indexing under Vertex's schema; embedding model (Voyage AI) stays the same. |
| Upstash Redis | Upstash managed | Patient state (60s TTL), conversation context (120s TTL), FHIR cache (300s TTL), org→tenant cache (300s TTL). All TTL-bounded; transient PHI | Memorystore (Redis-compatible). Same client lib. |
| Langfuse Cloud | Langfuse managed | LLM call traces: prompt + completion + token counts + cost. Carries case_id, patient_id, raw clinical text in prompts | Decision required: keep Langfuse Cloud (verify BAA), self-host Langfuse on GKE, or migrate observability to Cloud Logging + Vertex AI. The traces are the highest-PHI single export. |
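The "all TTL-bounded" property of the Redis row can be made harder to break by routing every cache write through one lookup. The sketch below is an assumption about how that might look, not the existing client configuration: the key-kind names and the `ttl_for` helper are hypothetical, and only the TTL values come from the table above.

```python
# TTLs in seconds, taken from the Upstash Redis row above.
# The kind names are illustrative, not the real cache key prefixes.
CACHE_TTL_SECONDS = {
    "patient_state": 60,
    "conversation_context": 120,
    "fhir": 300,
    "org_tenant": 300,
}


def ttl_for(kind: str) -> int:
    """Return the TTL for a cache kind, refusing kinds with no TTL.

    Failing loudly on an unknown kind means no PHI-bearing value can be
    cached without an expiry by accident.
    """
    try:
        return CACHE_TTL_SECONDS[kind]
    except KeyError:
        raise KeyError(f"no TTL registered for cache kind {kind!r}") from None
```

With a redis-py style client, a write would then pass the result as the expiry, for example `client.set(key, value, ex=ttl_for("patient_state"))`. Because Memorystore is Redis-compatible, the same lookup would carry over unchanged.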
3. External egress — what data leaves the platform¶
| Destination | Endpoint | Triggered by | Data sent | PHI? |
|---|---|---|---|---|
| Anthropic | api.anthropic.com | Every LLM call (orchestrator, triage, intake, match, explain, extractor) | Patient message + conversation history + system prompts that carry clinical context | Yes (full PHI) |
| OpenAI (fallback) | api.openai.com | LLM gateway fallback on Anthropic 5xx/429/timeout | Same shape as Anthropic call that failed | Yes (full PHI) |
| Voyage AI | api.voyageai.com | Embedding generation for provider/condition descriptions | Provider/condition text only; no patient text embedded | No |
| Clerk | api.clerk.com | JWT verification (JWKS fetch), org/user lookups, webhooks inbound | User IDs, org IDs, no clinical data | No |
| Flagsmith | edge.api.flagsmith.com (runtime SDK), api.flagsmith.com (admin proxy) | Per-request flag evaluation | tenant_id sometimes used as identifier for per-tenant overrides | No (tenant_id is not PHI) |
| Cloudflare R2 | <account>.r2.cloudflarestorage.com | Document upload (browser direct), document read (worker) | Document bytes | Yes (full PHI) |
| Neo4j AuraDB | <instance>.databases.neo4j.io | Clinical graph queries | Patient ID + clinical entity lookups | Yes (clinical graph) |
| Qdrant Cloud | <cluster>.qdrant.io | Vector search for matching | Query embedding (derived from clinical context) | Indirect (embedding is one-way derived from PHI but not reversible) |
| Upstash Redis | <endpoint>.upstash.io (HTTPS) | Cache get/set | Patient state + conversation context | Yes (transient PHI, TTL-bounded) |
| Upstash QStash | qstash.upstash.io | Async task enqueue | document_id, case_id (no clinical text in task body) | Indirect (id only) |
| Langfuse | cloud.langfuse.com | LLM trace export (every llm_gateway call) | Full prompt + completion + case_id + patient_id | Yes (full PHI in trace bodies) |
| Daily.co | api.daily.co (room creation), WebRTC media | Video consultation | Room metadata via API; video stream PHI via WebRTC | Yes (A/V stream) |
| Frankfurter | api.frankfurter.app | Daily currency rate refresh | None | No |
| Telegram | api.telegram.org | Operator alerts | Alert title + truncated context (case_id allowed, no clinical text) | Indirect (id only) |
| Axiom | api.axiom.co | Application log shipping | Stdout logs | Should be PHI-free; verify via log policy |
| Email provider (TBD: Resend/SES) | varies | Notifications, intake reminders, redacted forwarding packets | Patient name + case summary (or redacted packet for forwarding) | Yes (controlled PHI) |
| PostHog | app.posthog.com | Behavioral analytics | patient_id (UUID only), event names, no clinical content | Indirect (id only); verify via instrumentation audit |
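The OpenAI row says the gateway falls back on an Anthropic 5xx, 429, or timeout. A minimal sketch of that decision is below. The function name `should_fall_back` is hypothetical and the real gateway logic is not shown in this document, so treat this as an illustration of the stated trigger conditions only.

```python
from typing import Optional


def should_fall_back(status_code: Optional[int], timed_out: bool = False) -> bool:
    """Decide whether a failed primary LLM call should go to the fallback.

    Mirrors the triggers named in the table: timeout, HTTP 429, or any
    5xx response. Other 4xx errors are treated as caller errors and are
    not retried against the fallback provider.
    """
    if timed_out:
        return True
    if status_code is None:
        return False
    return status_code == 429 or 500 <= status_code <= 599
```

Since the fallback request has the same shape as the failed call, it carries the same PHI, which is why both destinations are marked as full PHI in the table.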
Migration decision matrix for external dependencies¶
| Dependency | Action | Rationale |
|---|---|---|
| Anthropic | Optionally route via Vertex AI Anthropic | Vertex provides BAA + in-VPC traffic. Costs ~5% premium; eliminates public-internet egress for LLM PHI |
| OpenAI | Keep via public internet (fallback only, low volume) | Vertex doesn't host OpenAI models; the fallback path is rare |
| Voyage AI | Keep public — no PHI sent | |
| Clerk | Keep — SaaS, BAA available | |
| Flagsmith | Keep — no PHI sent (only tenant_id) | |
| R2 | Migrate → GCS | Bucket-per-tenant or prefix isolation; Cloud Storage HIPAA-eligible |
| Neo4j Aura | Keep with BAA, or self-host on GKE | If Aura's region is in-GCP-VPC range, BAA is the lighter path |
| Qdrant Cloud | Keep with BAA, or → Vertex AI Vector Search | Re-indexing cost vs in-VPC traffic |
| Upstash Redis | Migrate → Memorystore | The smallest Memorystore instance should cover Curaway's volume (confirm pricing; Memorystore has no free tier) |
| Upstash QStash | Migrate → Cloud Tasks + Cloud Scheduler | See async pipelines doc |
| Langfuse | Decision pending — likely self-host on GKE | Trace bodies are the highest-PHI export; BAA should be verified or paths restricted |
| Daily.co | Keep — SaaS, BAA available | Video PHI routed via Daily's HIPAA-mode rooms |
| Frankfurter | Keep public — no PHI | |
| Telegram | Keep public — no PHI in alert content (verify via alert audit) | |
| Axiom | Verify log scrubbing policy; keep or migrate to Cloud Logging | If Cloud Logging used, BigQuery export gives equivalent query power |
| Email provider | Verify HIPAA-eligible plan / BAA | |
| PostHog | Verify no clinical content in events; otherwise keep | |
4. Cross-tenant data flows¶
These are the only legitimate cross-tenant data crossings in Curaway. Every other path enforces tenant isolation via RLS + DAO scoping.
4a. Case forwarding — patient → providers (multi-tenant)¶
When a patient consents to forwarding, a redacted case packet crosses from the patient's tenant (e.g. tenant-apollo-001) to one or more provider tenants. The redaction engine (app/services/redaction_engine.py) strips:
- Patient name + DOB
- Contact info (email, phone, address)
- Government IDs (SSN, passport)
- Insurance numbers
What survives the redaction:
- Anonymized patient ID (case-scoped, not the canonical patient_id)
- Clinical content (diagnosis, procedures, lab values, imaging summaries)
- Preferences (language, travel constraints, budget range)
- Generated case_share row links the original case to the forwarded snapshot
Audit: every forwarding event writes to data_forwarding_audits (append-only).
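The strip/survive split above can be sketched as a filter over a flat record. This is an illustration under stated assumptions, not the code in app/services/redaction_engine.py: the field names in `STRIPPED_FIELDS`, the helper names, and the HMAC-based derivation of the case-scoped ID are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical field names; the real schema is not shown in this document.
STRIPPED_FIELDS = frozenset({
    "name", "dob",                # patient name + DOB
    "email", "phone", "address",  # contact info
    "ssn", "passport",            # government IDs
    "insurance_number",           # insurance numbers
})


def case_scoped_id(patient_id: str, case_id: str, secret: bytes) -> str:
    """Derive an anonymized ID that differs per case for the same patient."""
    message = f"{case_id}:{patient_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:32]


def redact_case_packet(record: dict, case_id: str, secret: bytes) -> dict:
    """Return a copy of record without identity fields or canonical patient_id.

    Only top-level keys are filtered; nested structures would need a
    recursive walk, which this sketch omits.
    """
    packet = {
        key: value
        for key, value in record.items()
        if key not in STRIPPED_FIELDS and key != "patient_id"
    }
    packet["anonymized_patient_id"] = case_scoped_id(
        record["patient_id"], case_id, secret
    )
    return packet
```

Keying the derived ID on the case means two provider tenants that receive different cases for the same patient cannot join them on the anonymized ID. A deny-list like this one passes through any field it does not know about, so an allow-list of surviving fields may be the safer design if the record schema changes often.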
4b. MSO consultation — patient → MSO doctor¶
When a patient opts for a Medical Second Opinion, a doctor on tenant-mso-panel reads the full clinical record (no redaction) and writes back an opinion. The MSO doctor's BAA must cover this access.
4c. Admin portal — cross-tenant queries¶
Platform admins (role platform_admin on tenant-curaway-admin) can query across tenants for: flag management, matching config, audit log inspection. The admin endpoints (/api/v1/admin/flags/*, /api/v1/admin/matching/config) do not return PHI — only configuration state.
The admin tenant has no patient data of its own.
4d. Coordinator handoffs¶
Coordinators on tenant-curaway-ops can view cases for any patient tenant they're assigned to. The assignment is per-case (recorded in cases.coordinator_user_id); coordinators don't have blanket read access to a tenant.
5. Data lifecycle and erasure¶
Retention defaults¶
| Data class | Retention | Where enforced |
|---|---|---|
| Patient PII (patients) | Indefinite while account active | Manual delete on account closure (TODO: scripted GDPR Article 17 cascade) |
| Clinical records (fhir_resources, cases) | Indefinite (statutory medical-record retention) | DB-level |
| Documents (R2 + document_references) | Indefinite while case open; cleaned per region statute | R2 lifecycle policies (today: not configured; TODO) |
| Conversations (messages) | Indefinite | DB-level |
| LLM traces (Langfuse) | 30 days default | Langfuse retention setting |
| Redis cache | 60s–300s TTL | Redis client config |
| Events (events) | Indefinite; used for compliance trail | DB-level |
| Audit logs (audit_logs) | Indefinite, append-only; even super-admins cannot delete | DB triggers |
GDPR Article 17 (right to erasure)¶
Cascade across:
1. patients → cases → fhir_resources → document_references → R2 objects
2. patients → conversations → messages
3. patients → consultations → Daily.co room deletion (manual API call)
4. patients → consent_records (mark deleted, don't drop — legal requirement)
5. patients → Langfuse traces (manual trace-delete API)
6. patients → Neo4j Patient node + :HAS_CONDITION / :UNDERWENT_PROCEDURE edges (Cypher cascade)
7. Always-retained (anonymized): aggregated outcome data without patient_id
Today: cascade is partially scripted — Postgres + R2 + Neo4j parts done; Langfuse + Daily.co are manual. Migration target: complete the script before cutover.
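One way to track what remains before cutover is to model the cascade as data and flag the steps that are still manual. The sketch below is an assumption about how such a tracker could look, not the existing erasure script: the `ErasureStep` type and `manual_steps` helper are hypothetical, and the scripted/manual flags are copied from the status line above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErasureStep:
    target: str     # system that holds the data
    action: str     # what the step does
    scripted: bool  # True if automated today, per the status above


ERASURE_CASCADE = (
    ErasureStep("postgres", "delete patients, cases, fhir_resources, document_references", True),
    ErasureStep("r2", "delete document objects", True),
    ErasureStep("postgres", "delete conversations and messages", True),
    ErasureStep("daily.co", "delete consultation rooms", False),
    ErasureStep("postgres", "mark consent_records deleted (rows are kept)", True),
    ErasureStep("langfuse", "delete traces", False),
    ErasureStep("neo4j", "delete Patient node and its edges", True),
)


def manual_steps(cascade=ERASURE_CASCADE):
    """Return the steps that still need a human to run them."""
    return [step for step in cascade if not step.scripted]
```

Calling `manual_steps()` yields the Daily.co and Langfuse steps, which matches the two gaps named above. Any store kept on managed SaaS after the migration would be added to the tuple with `scripted=False` until its deletion call is automated.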
Tombstone / deletion certificate¶
Every erasure produces a certificate (signed PDF) sent to the patient and stored in data_forwarding_audits with event_type=gdpr_erasure_executed. Already implemented today; no migration change.
6. Encryption + key management¶
| Layer | Today | GCP target |
|---|---|---|
| At-rest (Postgres) | Railway-provided AES-256 | Cloud SQL CMEK (Cloud KMS-backed). Tenant-scoped keys optional. |
| At-rest (R2) | R2 server-side encryption | GCS CMEK |
| At-rest (Neo4j Aura) | Aura-provided | Same on Aura, or BYOK if self-host |
| In-transit | TLS 1.2+ everywhere | Same |
| Field-level (sensitive PII columns) | Not implemented: fields stored plaintext in JSONB | TODO regardless of migration. Use Cloud KMS envelope encryption: encrypt on write, decrypt on read. Affects: patients.contact, patients.dob, consultations.notes |
| LLM trace bodies (Langfuse) | TLS in transit; at-rest per Langfuse | If self-hosted, Cloud KMS at rest |
| Secrets | Railway env vars | Secret Manager with auto-rotation policy |
7. Pre-migration data-flow checklist¶
For the migration team to validate before cutover:
- [ ] Confirm tenant_isolation RLS policies survive the Postgres dump/restore (policy definitions are part of the schema dump; pg_dump --enable-row-security only makes the data dump honor RLS, so dump as a role that bypasses RLS to capture all rows)
- [ ] Re-create curaway_app non-superuser role on Cloud SQL with identical grants
- [ ] Bucket-per-tenant naming convention chosen on GCS (or document the prefix-isolation policy)
- [ ] Memorystore TTL behavior verified to match Upstash (especially MULTI/EXEC and TTL-on-EXISTS)
- [ ] Neo4j connection string + region picked; latency from Cloud Run measured
- [ ] Qdrant decision finalised (keep Cloud or migrate to Vertex)
- [ ] Langfuse decision finalised (BAA / self-host / replace)
- [ ] Vertex AI Anthropic vs public Anthropic decision per service (chat / extractor / matching)
- [ ] All managed-SaaS BAAs in place (Clerk, Daily.co, Email, Anthropic if not Vertex)
- [ ] GDPR cascade script extended to cover the targets we keep on managed SaaS
Code references¶
- RLS migration: alembic/versions/785ab8660c94_enable_rls_on_patient_data_tables.py
- Redaction engine: app/services/redaction_engine.py
- Models with tenant_id: see app/models/*.py for the 13 RLS tables
- DAO base classes: app/repositories/base.py; BaseRepository._scoped_query() raises TenantIsolationViolation if tenant_id is empty
- ADR references: ADR-0018 (multi-tenancy), ADR-0019 (GDPR erasure cascade), data governance section in Architecture/02-data-model