ADR-0012: SSE for Document Upload Progress
Status: Accepted Date: 2026-04-05 Session: 32
Context
Patients upload medical documents that take 5-30 seconds to fully process (OCR → clinical extraction → requirement matching). Without feedback, the UI showed a static spinner for the entire duration. There was no way to distinguish "OCR running" from "stuck / failed", and users sometimes navigated away or re-uploaded the same file thinking nothing was happening.
A real-time progress mechanism was needed. The document processing pipeline already had defined stages (upload received, OCR started, OCR complete, analysis started, analysis complete, matching complete), but none of these emitted observable events to the frontend.
Decision
Extend the existing `/api/v1/patients/{id}/documents/stream` SSE endpoint with a `progress` event type, fed by a Redis channel (`doc_progress:{patient_id}`). Each pipeline stage calls `emit_progress()` to push a `ProgressEvent` to Redis. The SSE endpoint polls Redis every 300 ms and forwards progress events to the connected browser.
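
The producer side of this decision can be sketched as follows. This is a minimal illustration, not the project's code: the `ProgressEvent` fields other than `detail`, the list-plus-TTL storage model (`rpush` / `expire`), the 300-second TTL, and the `InMemoryRedis` stand-in are all assumptions made so the example runs without a Redis server. Swallowing errors inside `emit_progress()` is also an assumption, chosen so that a Redis outage cannot fail the pipeline.

```python
import json
import time
from dataclasses import dataclass, asdict, field

PROGRESS_TTL_SECONDS = 300  # assumed value, chosen to match the 5-minute stream cap


@dataclass
class ProgressEvent:
    document_id: str      # hypothetical field
    stage: str            # hypothetical field, e.g. "ocr_started"
    detail: str = ""      # user-safe text only, never PHI
    ts: float = field(default_factory=time.time)  # hypothetical field


class InMemoryRedis:
    """Stand-in exposing only the rpush/expire/lrange subset of a Redis client."""

    def __init__(self):
        self.lists = {}
        self.ttls = {}

    def rpush(self, key, *values):
        self.lists.setdefault(key, []).extend(values)
        return len(self.lists[key])

    def expire(self, key, seconds):
        self.ttls[key] = seconds
        return True

    def lrange(self, key, start, end):
        items = self.lists.get(key, [])
        # Redis treats `end` as inclusive and -1 as "last element".
        stop = len(items) if end == -1 else end + 1
        return items[start:stop]


def emit_progress(client, patient_id, event):
    """Best-effort push of one event; returns False instead of raising."""
    key = f"doc_progress:{patient_id}"
    try:
        client.rpush(key, json.dumps(asdict(event)))
        client.expire(key, PROGRESS_TTL_SECONDS)  # refresh TTL so stale keys expire
        return True
    except Exception:
        return False
```

A pipeline stage would then call something like `emit_progress(client, patient_id, ProgressEvent("doc-1", "ocr_started", "Reading your document"))`, ignoring the return value.
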
Rationale
- Reuses existing SSE infrastructure. ADR-0004 established SSE + Redis pub/sub for chat streaming. Extending the same pattern to document progress avoids a new transport layer. The endpoint was already open for DB-status polling; adding a fast Redis channel costs nothing structurally.
- Dual-source stream is safe to degrade. Redis errors are silently swallowed, and the stream falls back to 5s DB polling. The user still sees the final `document_update` and `done` events even if all progress events are lost.
- Shared pipeline eliminates code drift. Both the inline OCR fast path (`confirm_upload`) and the QStash async callback call `run_post_ocr_pipeline()`, which calls `emit_progress()`. Progress events fire from both paths without duplication.
- Frontend step display is purely cosmetic. If Redis is slow or a step event is missed, the step indicator falls back to the last known state. There is no state machine on the frontend, just rendering from the latest event list.
- Heartbeat keeps long connections alive. Proxies (Railway → Vercel, Cloudflare CDN) may drop idle SSE connections. A 5s heartbeat event prevents this without adding backend complexity.
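
The dual-source behaviour described in the points above (fast Redis poll, slow DB poll, heartbeat, silent degradation) can be sketched as a single loop. This is a simplified, synchronous illustration under stated assumptions: `read_progress` and `read_db_status` are hypothetical callables, the `"complete"` / `"failed"` status values and the `heartbeat` event name are invented for the example, and a real endpoint would most likely be an async generator. The clock and sleep functions are injectable only so the loop can be exercised without waiting.

```python
import json
import time


def sse_frame(event, data):
    """Format one Server-Sent Events frame: event name plus JSON payload."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_progress(read_progress, read_db_status, clock=time.monotonic,
                    sleep=time.sleep, fast_poll=0.3, db_poll=5.0,
                    heartbeat=5.0, max_seconds=300.0):
    """Yield SSE frames until the document finishes or the time cap is hit.

    read_progress(cursor) -> list of event dicts after `cursor` (hypothetical)
    read_db_status()      -> dict such as {"status": "processing"} (hypothetical)
    """
    start = clock()
    last_db = last_beat = start
    cursor = 0
    last_status = None
    while clock() - start < max_seconds:
        try:
            events = read_progress(cursor)
        except Exception:
            events = []  # Redis trouble is swallowed; DB polling still runs
        for ev in events:
            cursor += 1
            yield sse_frame("progress", ev)

        now = clock()
        if now - last_db >= db_poll:
            last_db = now
            status = read_db_status()
            if status != last_status:  # only forward actual changes
                last_status = status
                yield sse_frame("document_update", status)
            if status.get("status") in ("complete", "failed"):
                yield sse_frame("done", {})
                return
        if now - last_beat >= heartbeat:
            last_beat = now
            yield sse_frame("heartbeat", {})  # keeps proxies from dropping the stream
        sleep(fast_poll)
```

If `read_progress` raises on every call, the loop still yields `document_update` and `done` frames from the DB path, which is the degradation property the rationale relies on.
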
Alternatives Considered
| Alternative | Pros | Cons | Verdict |
|---|---|---|---|
| WebSocket | Bidirectional, lower latency | Overkill for one-way status updates; adds server configuration complexity | Rejected |
| Polling from frontend | Simple, no SSE infrastructure | Adds DB load per client; higher latency between steps; poor UX (steps appear in jumps) | Rejected |
| QStash callback to frontend | No open connection | Requires browser-accessible webhook URL; not feasible from browser context | Rejected |
| Progress in confirm_upload response | Zero infrastructure | Only reports status at a single point in time; no mid-processing visibility | Rejected |
Consequences
- Positive: Patients see step-by-step progress (6 stages) as their document is processed. Perceived wait time is lower than with a static spinner.
- Positive: The `detail` field in `ProgressEvent` can carry user-safe status messages without exposing PHI.
- Positive: A Langfuse span is created per progress event, enabling pipeline stage timing analysis in production.
- Negative: Redis must be available for progress events to fire. When Redis is down, the stream degrades to DB polling only — users see no step progress, just the final state change.
- Negative: The SSE connection is held open for up to 5 minutes per upload session. At current scale (POC) this is not a concern. At 1,000 concurrent uploads it would consume a significant share of the Railway container's file descriptors.
- Accepted risk: If `emit_progress()` is called but the browser has already disconnected (user navigated away), the push is a no-op (Redis TTL handles cleanup). No resource leak.
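
The six-stage display can be derived statelessly from whatever events have arrived. The sketch below shows the logic only, written in Python for illustration; the real frontend would implement the equivalent in its own language. The stage identifiers are hypothetical names for the six stages listed in Context, and the "furthest stage seen wins" rule is an assumption about how a missed event is tolerated.

```python
# Hypothetical identifiers for the six pipeline stages named in Context.
STAGES = [
    "upload_received",
    "ocr_started",
    "ocr_complete",
    "analysis_started",
    "analysis_complete",
    "matching_complete",
]


def render_steps(events):
    """Map received progress events to a per-stage display state.

    The furthest stage seen wins, so a missed or out-of-order event
    never moves the indicator backwards.
    """
    seen = [STAGES.index(e["stage"]) for e in events if e.get("stage") in STAGES]
    furthest = max(seen, default=-1)
    return [
        {"stage": name, "state": "done" if i <= furthest else "pending"}
        for i, name in enumerate(STAGES)
    ]
```

Because the function takes the full event list and returns the full display state, losing every progress event simply leaves all steps pending until the final status arrives from the DB path.
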