Document Viewer — Feature Spec

Context

You are working on both the Curaway backend (curaway_src repo) and the Curaway Health Navigator frontend (curaway-health-navigator repo). Read CLAUDE.md in both repos before starting. Read the steer document at docs/specs/ai-steer/document-viewer-steer.md for design rationale.

Issue: #119
Steer: docs/specs/ai-steer/document-viewer-steer.md

Branch

# Backend
cd curaway_src
git checkout main && git pull origin main
git checkout -b feature/document-viewer

# Frontend
cd curaway-health-navigator
git checkout main && git pull origin main
git checkout -b feature/document-viewer

What You're Building

A document viewer that lets patients view and download their uploaded medical documents. The backend exposes a download endpoint that returns a presigned R2 GET URL. The frontend adds a DocumentViewer component (modal with iframe/img) and wires "View" buttons into the DocsPanel and RequirementsChecklist.


PART A: Backend Changes

A1: New Schema — DocumentDownloadResponse (Sonnet)

File: app/schemas/document.py

Add a new Pydantic model after the existing DocumentRead class:

class DocumentDownloadResponse(BaseModel):
    """Presigned download URL for viewing/downloading a document from R2."""

    download_url: str = Field(
        ...,
        description="Presigned GET URL for Cloudflare R2. Expires after `expires_in` seconds. "
        "Use as iframe src (PDF) or img src (images). For download, set Content-Disposition via the `download` param.",
    )
    content_type: str = Field(
        ...,
        description="MIME type of the document (e.g. 'application/pdf', 'image/jpeg'). "
        "Read from document metadata, not user-provided.",
    )
    filename: str = Field(
        ...,
        description="Original filename for display and Content-Disposition.",
    )
    document_id: str = Field(
        ...,
        description="Document reference UUID.",
    )
    expires_in: int = Field(
        default=900,
        description="URL expiry in seconds (default 900 = 15 minutes).",
    )

    model_config = {
        "json_schema_extra": {
            "examples": [
                {
                    "download_url": "https://account.r2.cloudflarestorage.com/curaway-documents/tenant-001/patient-001/doc-001/knee-xray.pdf?X-Amz-Expires=900&...",
                    "content_type": "application/pdf",
                    "filename": "knee-xray-2026.pdf",
                    "document_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
                    "expires_in": 900,
                }
            ]
        }
    }

A2: New Service Function — get_download_url() (Opus)

File: app/services/document_service.py

Add after the existing get_document() function:

async def get_download_url(
    db: AsyncSession,
    document_id: str,
    patient_id: str,
    tenant_id: str,
    actor_id: str = "system",
) -> dict:
    """Generate a presigned download URL for a document.

    Validates ownership (patient_id + tenant_id), generates a presigned
    GET URL via R2, and emits a document.viewed event + audit log.

    Returns a dict matching DocumentDownloadResponse fields.
    Raises ValueError if document not found or not owned by this patient/tenant.
    """
    doc = await get_document(db, document_id, patient_id, tenant_id)
    if not doc:
        raise ValueError(f"Document {document_id} not found for patient {patient_id}")

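    # Generate presigned GET URL from R2. content_type is signed into the URL
    # as ResponseContentType (requires the generate_download_url() update in Part C).
    download_url = r2_client.generate_download_url(
        key=doc.storage_key,
        expires_in=900,
        content_type=doc.mime_type,
    )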

    if not download_url:
        # R2 not configured — return placeholder (same pattern as upload presign)
        from app.config import settings
        download_url = (
            f"https://{settings.r2_bucket_name}.r2.cloudflarestorage.com"
            f"/{doc.storage_key}?X-Amz-Expires=900"
        )

    # Audit logging: document.viewed event (AuditLog + Event pair)
    db.add(
        AuditLog(
            id=str(uuid.uuid4()),
            actor_id=actor_id,
            actor_type="user",
            action="document.viewed",
            resource_type="DocumentReference",
            resource_id=doc.id,
            tenant_id=tenant_id,
            changes={
                "filename": doc.original_filename,
                "content_type": doc.mime_type,
                "access_type": "presigned_url",
            },
        )
    )

    db.add(
        Event(
            id=str(uuid.uuid4()),
            event_type="document.viewed",
            tenant_id=tenant_id,
            actor_id=actor_id,
            patient_id=patient_id,
            payload={
                "document_id": doc.id,
                "content_type": doc.mime_type,
                "filename": doc.original_filename,
                "access_type": "presigned_url",
            },
            source_service="document_api",
        )
    )

    await db.commit()

    return {
        "download_url": download_url,
        "content_type": doc.mime_type,
        "filename": doc.original_filename,
        "document_id": doc.id,
        "expires_in": 900,
    }

Opus review points:

  • Verify the get_document() four-condition filter (document_id + patient_id + tenant_id + is_deleted=False) is sufficient for access control.
  • Confirm the audit log pattern matches confirm_upload() (it should -- same Event + AuditLog pair).
  • Decide whether actor_id should come from the Clerk JWT claims or default to "system". Recommendation: pass the Clerk sub claim when available. Note that the AuditLog above hardcodes actor_type="user", which is inconsistent when actor_id falls back to "system".
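The actor_id recommendation above could be sketched as a small helper. This is only an illustration: resolve_actor() is a hypothetical name, the claims mapping stands in for whatever the Clerk JWT middleware exposes, and the "system" actor_type value is an assumption (the spec only shows "user").

```python
from typing import Mapping, Optional, Tuple

SYSTEM_ACTOR = "system"  # matches the default actor_id in get_download_url()


def resolve_actor(claims: Optional[Mapping[str, object]]) -> Tuple[str, str]:
    """Return (actor_id, actor_type) for audit records.

    Uses the JWT `sub` claim when it is a non-empty string; otherwise
    falls back to the system actor so the audit row is never left blank.
    """
    sub = claims.get("sub") if claims else None
    if isinstance(sub, str) and sub.strip():
        return sub.strip(), "user"
    # Assumption: "system" is an acceptable actor_type for non-user calls.
    return SYSTEM_ACTOR, "system"
```

The router would then pass the resolved actor_id into document_service.get_download_url(), and the service would use the matching actor_type instead of a hardcoded value.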

A3: New Endpoint — GET .../documents/{document_id}/download (Sonnet)

File: app/routers/documents.py

Add after the existing list_documents endpoint:

from app.schemas.document import DocumentDownloadResponse

@router.get(
    "/{document_id}/download",
    response_model=APIResponse[DocumentDownloadResponse],
    summary="Get a presigned download URL for a document",
    description=(
        "Returns a time-limited presigned GET URL for downloading or previewing "
        "a document stored in Cloudflare R2. The URL expires after 15 minutes. "
        "Each call generates a fresh URL and logs a `document.viewed` event.\n\n"
        "**Access control:** The document must belong to the specified patient_id "
        "and the request's tenant_id. Returns 404 if not found or not owned."
    ),
    responses={
        200: {"description": "Presigned download URL generated successfully"},
        404: {"description": "Document not found or not owned by this patient/tenant"},
    },
)
async def get_download_url(
    patient_id: str,
    document_id: str,
    tenant_id: str = Depends(_get_tenant_id),
    db: AsyncSession = Depends(get_db),
):
    """Generate a presigned R2 GET URL for document preview/download."""
    await _require_patient(patient_id, tenant_id, db)

    try:
        result = await document_service.get_download_url(
            db=db,
            document_id=document_id,
            patient_id=patient_id,
            tenant_id=tenant_id,
        )
    except ValueError as exc:
        # Chain the original error so the 404 keeps its cause in tracebacks.
        raise HTTPException(
            status_code=404,
            detail=APIResponse(
                success=False,
                errors=[
                    ErrorDetail(
                        code="STORAGE_DOCUMENT_NOT_FOUND_001",
                        message=f"Document {document_id} not found for this patient",
                    )
                ],
            ).model_dump(mode="json"),
        ) from exc

    return APIResponse(
        success=True,
        data=DocumentDownloadResponse(**result),
    )

Import additions at top of file:

from app.schemas.document import DocumentDownloadResponse

A4: New Event Type — document.viewed (Sonnet)

File: app/services/decision_recorder.py

No code changes needed. The document.viewed event follows the existing pattern used by document.uploaded in confirm_upload(). The event_type is a free-form string on the Event model. Add a comment to the Event model docstring noting the new event type:

File: app/models/event.py (line 28-29, update the comment)

Update the event_type comment to include document.viewed:

    # e.g. "intake.started", "intake.completed", "match.executed",
    #      "consent.granted", "notification.sent", "document.uploaded",
    #      "document.viewed"


PART B: Frontend Changes

B1: New Component — DocumentViewer.tsx (Opus)

File: src/components/documents/DocumentViewer.tsx

A modal/drawer component that displays document content using the presigned URL.

Props:

interface DocumentViewerProps {
  isOpen: boolean;
  onClose: () => void;
  documentId: string;
  patientId: string;
  filename: string;
  contentType: string;
  uploadedAt?: string;
}

Behavior:

  1. When isOpen becomes true, call getDownloadUrl(patientId, documentId) to fetch the presigned URL.
  2. Show a loading skeleton while the URL is being fetched.
  3. Choose the view based on contentType:
     • application/pdf: Render an <iframe> on desktop (viewport >= 768px). On mobile, open the URL in a new tab via window.open() and close the modal.
     • image/jpeg, image/png: Render an <img> with object-fit: contain. Support pinch-to-zoom (CSS touch-action: pinch-zoom) and scroll-wheel zoom.
     • Other (including image/tiff, which most browsers cannot render inline; see Edge Cases): Show a download-only view with a download button (no inline preview).
  4. Header: filename (truncated) + upload date + close button (X).
  5. Footer: "Download" button that fetches the file with Content-Disposition: attachment. Do not append response-content-disposition to an already-signed URL: the query string is part of the signature, so the modified URL is rejected. Use a separate download link signed with the attachment disposition.
  6. Error state: If the URL fetch fails (404, network error), show an error message with a retry button.

Layout:

  • Overlay: fixed, inset-0, z-50, bg-black/40, click backdrop to close.
  • Modal: centered, max-w-4xl, max-h-[90vh], white background, rounded corners.
  • On mobile: full-screen (inset-0, no rounded corners).
  • Close on Escape key.
  • Prevent body scroll when open.
  • Use CSS variable tokens for dark mode compatibility (bg-chat-surface, text-chat-text, border-chat-border).

Zoom controls (images only):

  • Zoom in / zoom out buttons (+ / -) in the header or footer.
  • Scroll wheel zoom with transform: scale().
  • Pan via cursor: grab + mouse drag when zoomed in.
  • Reset zoom button.

PostHog events:

  • document_viewed with { document_id, content_type } when the viewer opens.
  • document_downloaded with { document_id } when the download button is clicked.

B2: Update DocsPanel — Add "View" Button (Sonnet)

File: src/components/panels/DocsPanel.tsx

In the RequirementCard component, when a document is uploaded (i.e., doc is not null and status is uploaded, analyzed, completed, or has_issues):

  1. Add an Eye icon button (from lucide-react) to the right of the filename row.
  2. On click, open the DocumentViewer with the document's metadata.
  3. The DocumentViewer needs documentId, patientId, filename, and contentType. These come from the doc object in the checklist response.

Changes to RequirementCard:

  • Import Eye from lucide-react.
  • Import DocumentViewer from @/components/documents/DocumentViewer.
  • Add state: const [viewerOpen, setViewerOpen] = useState(false).
  • Add the Eye button after the filename text.
  • Render <DocumentViewer> conditionally when viewerOpen is true.

Note: The checklist API response (doc object) must include document_id and content_type. Verify the backend GET /cases/{id}/document-checklist endpoint returns these fields. If not, add them to the checklist response.

B3: Update RequirementsChecklist — Add View Action (Sonnet)

File: src/components/RequirementsChecklist.tsx

This component shows procedure requirements (before travel / on-site). It does not currently show uploaded documents inline. However, when this component is used in a context where documents have been matched to requirements, add a "View" link.

Scope: This is a lower-priority change. The RequirementsChecklist is primarily used on the public storefront procedure pages where no patient documents exist. The DocsPanel (B2) is the primary entry point. Defer this to a follow-up if it adds complexity.

B4: New API Function — getDownloadUrl() (Sonnet)

File: src/services/documentApi.ts (or add to the existing API client)

export async function getDownloadUrl(
  patientId: string,
  documentId: string,
): Promise<{
  download_url: string;
  content_type: string;
  filename: string;
  document_id: string;
  expires_in: number;
}> {
  const response = await apiClient.get(
    `/api/v1/patients/${patientId}/documents/${documentId}/download`
  );
  return response.data;
}

Use the existing ApiClient pattern from src/lib/api-client.ts, which handles:

  • X-Tenant-ID header injection
  • Clerk JWT Authorization: Bearer header
  • Response envelope unwrapping (response.data)
  • Error handling


PART C: R2 CORS Verification (Sonnet)

Verify the R2 bucket CORS configuration allows GET requests from the frontend domain. The bucket already allows PUT for uploads. Check that the CORS rules include:

[
  {
    "AllowedOrigins": [
      "https://app.curaway.ai",
      "http://localhost:5173",
      "http://localhost:3000"
    ],
    "AllowedMethods": ["GET", "PUT", "HEAD"],
    "AllowedHeaders": ["*"],
    "MaxAgeSeconds": 3600
  }
]

If GET is not in AllowedMethods, add it via the Cloudflare dashboard or the S3-compatible API.

Content-Disposition: For inline preview, the presigned URL should serve with Content-Disposition: inline. For the download button, the disposition must be part of the signed request: generate the URL with ResponseContentDisposition set to attachment; filename="report.pdf". The frontend cannot append response-content-disposition to an existing presigned URL, because the query string is covered by the signature and any change invalidates it. The backend can generate two URLs (one inline, one attachment) -- but a single inline URL is sufficient for MVP since the browser's "Save As" always works.

Content-Type: The presigned URL must include ResponseContentType matching the document's mime_type from metadata. Update r2_client.generate_download_url() to accept optional content_type and content_disposition parameters:

def generate_download_url(
    key: str,
    expires_in: int = 900,
    content_type: str | None = None,
    content_disposition: str = "inline",
) -> str | None:
    client = _get_client()
    if not client:
        return None

    try:
        params = {
            "Bucket": settings.r2_bucket_name,
            "Key": key,
        }
        if content_type:
            params["ResponseContentType"] = content_type
        if content_disposition:
            params["ResponseContentDisposition"] = content_disposition

        return client.generate_presigned_url(
            "get_object",
            Params=params,
            ExpiresIn=expires_in,
        )
    except (BotoCoreError, ClientError) as e:
        logger.error("Failed to generate download presigned URL: %s", e)
        return None

Test Plan

Backend Unit Tests (Sonnet)

File: tests/test_document_download.py

| Test | What it verifies |
|---|---|
| test_download_url_success | Valid patient + valid document -> 200 with presigned URL, correct content_type and filename |
| test_download_url_wrong_tenant | Document exists but tenant_id mismatch -> 404 |
| test_download_url_wrong_patient | Document exists but patient_id mismatch -> 404 |
| test_download_url_deleted_document | Document with is_deleted=True -> 404 |
| test_download_url_nonexistent_document | Random UUID -> 404 |
| test_download_url_patient_not_found | Invalid patient_id -> 404 (from _require_patient) |
| test_download_url_event_logged | After successful call, verify document.viewed event in events table |
| test_download_url_audit_logged | After successful call, verify AuditLog with action=document.viewed |
| test_download_url_r2_not_configured | When R2 credentials missing, returns placeholder URL (not 500) |

Test pattern: Use the existing test fixtures and async_session patterns from tests/test_document_*.py.

Frontend Component Tests (Sonnet)

File: src/components/documents/__tests__/DocumentViewer.test.tsx

| Test | What it verifies |
|---|---|
| renders loading state | Shows skeleton while fetching URL |
| renders PDF in iframe | For content_type: application/pdf, renders <iframe> with presigned URL as src |
| renders image inline | For content_type: image/jpeg, renders <img> with presigned URL as src |
| opens new tab on mobile for PDF | When viewport < 768px, calls window.open() instead of rendering iframe |
| shows error on fetch failure | When API returns 404, shows error message with retry button |
| closes on Escape | Pressing Escape calls onClose |
| closes on backdrop click | Clicking the overlay calls onClose |
| download button present | Download button renders and has correct href |

E2E Test (Sonnet)

File: e2e/document-viewer.spec.ts

  1. Upload a document (use existing upload flow).
  2. Navigate to the DocsPanel.
  3. Click the "View" button on the uploaded document.
  4. Verify the DocumentViewer modal opens.
  5. Verify the iframe/img contains a URL (presigned URL from R2).
  6. Close the viewer.
  7. Verify the viewer closes and the DocsPanel is visible again.

Edge Cases

| Edge Case | Scenario | Handling | Severity |
|---|---|---|---|
| Presigned URL expires while patient is viewing | Patient opens the DocumentViewer, fetches the presigned URL (15-min TTL), then leaves the tab open for 20 minutes. When they scroll or interact with the iframe/img, the browser may re-request the resource and get a 403 from R2. | The DocumentViewer component should track the expires_in value and set a timer. 60 seconds before expiry, silently re-fetch a fresh presigned URL via getDownloadUrl() and update the iframe/img src. If the re-fetch fails, show a non-blocking "Session expired. Click to refresh." overlay rather than a broken image. Add a url_refreshed PostHog event for telemetry. | Medium |
| R2 file deleted (GDPR erasure) but document record still exists | A GDPR data subject request deletes the R2 file, but the document_references row still exists and is not marked is_deleted=True because the cascade missed it. The patient sees the document in DocsPanel and clicks "View". The presigned URL is generated but R2 returns 404 when the browser tries to load it. | get_download_url() currently does not verify the R2 object exists before signing: presigned URLs are generated from the key, not a HEAD check. The DocumentViewer should handle the iframe/img onerror event: display "This document is no longer available" with a muted explanation ("It may have been removed at your request"). Do not expose the R2 404 details. Additionally, the GDPR deletion handler should mark document_references.is_deleted = True so the DocsPanel filters it out. Verify this cascade exists. | Medium |
| Document is TIFF or DICOM format | Medical imaging files (TIFF, DICOM .dcm) cannot be rendered inline by most browsers. The content_type is image/tiff or application/dicom. The DocumentViewer would show a blank iframe or broken image. | In the DocumentViewer's content-type switch (Part B1 behavior item 3), add TIFF and DICOM to the "Other" branch and show a download-only view with the message "This file format cannot be previewed in the browser. Click Download to save it." For DICOM specifically, add a future-work note about integrating Cornerstone.js or OHIF Viewer for inline DICOM rendering (post-MVP). Store the content-type check list in a constant: INLINE_PREVIEWABLE = ['application/pdf', 'image/jpeg', 'image/png', 'image/webp', 'image/gif']. | Low |
| Very large file (50MB+) | A patient uploaded a full CT scan PDF or high-resolution imaging study. The presigned URL triggers a 50MB+ download. On slow connections, the iframe takes minutes to load. Memory usage spikes on mobile browsers. | Add a file size check in get_download_url(): read doc.file_size_bytes from the document record. If > 20MB, return an additional size_warning: true field in the response. The DocumentViewer checks this flag and shows a "Large file (52 MB). This may take a moment to load." banner before rendering. For files > 50MB, skip inline preview entirely and show download-only mode. The download button should link to a URL signed with Content-Disposition: attachment (the download attribute alone is ignored for cross-origin URLs) so the browser downloads the file rather than loading it into memory. | Medium |
| Mobile with slow connection | Patient opens the viewer on a 3G connection. The presigned URL fetch succeeds (small JSON response) but the actual R2 file download stalls. The loading skeleton shows indefinitely. | Add a timeout to the iframe/img load. If the content hasn't loaded within 30 seconds, show a "Still loading. Your connection may be slow." message with options: "Keep waiting" or "Download instead" (which opens the URL in a new tab, letting the browser's download manager handle it). On mobile viewports (< 768px), PDFs already open in a new tab (per Part B1), which avoids the iframe memory issue entirely. | Low |
| Concurrent download requests for the same document | Two browser tabs (or a re-render) request presigned URLs for the same document simultaneously. Each call generates a separate presigned URL and logs a separate document.viewed event, inflating view counts. | This is acceptable behavior: each presigned URL is independent and the audit trail correctly shows two access events. However, add client-side dedup: the DocumentViewer should cache the presigned URL in component state (or a short-lived React Query cache with 5-min TTL) so that closing and re-opening the same document within 5 minutes reuses the URL without a new API call. This also avoids unnecessary audit log entries. | Low |
| content_type in DB doesn't match actual file in R2 | The document_references.mime_type says application/pdf but the actual file in R2 is a JPEG (e.g., wrong MIME detected at upload time, or the file was corrupted). The DocumentViewer renders an iframe for a PDF, but the browser receives image bytes, so the iframe shows garbage or nothing. | The presigned URL includes ResponseContentType from the DB record (Part C). If this mismatches the actual file, the browser may handle it gracefully (Chrome often auto-detects) or show a blank. Add an onerror handler on the iframe: if the PDF iframe fails to load, retry as an <img> tag. Browsers do not reliably fire error events for iframes, so pair this with the load timeout from the slow-connection case. If both fail, show the download-only fallback. Long-term fix: add a content-type verification step in the upload confirmation pipeline (compare declared MIME with magic bytes). | Low |
| Dark mode: PDF iframe doesn't respect theme | The DocumentViewer modal uses dark theme CSS variables (bg-chat-surface), but the PDF rendered inside the iframe has its own white background. The contrast between the dark modal chrome and bright white PDF is jarring. | This is a browser limitation: PDF rendering inside iframes is controlled by the browser's PDF viewer, not by CSS. Mitigation: set the iframe container background to white (bg-white) regardless of theme, with a subtle border separating it from the dark modal chrome. For images, apply a subtle drop shadow on dark backgrounds so white-background images don't bleed into the modal edge. Do not attempt CSS filter: invert() on medical documents: color accuracy matters for imaging. | Low |
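The TIFF/DICOM and large-file cases above reduce to one small decision. The sketch below expresses it in Python to match the backend code in this spec; the real check would most likely live in DocumentViewer.tsx. choose_view_mode() and the mode names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption, since the spec does not say which definition applies.

```python
from typing import Optional

# Content types the spec lists as safe to preview inline.
INLINE_PREVIEWABLE = frozenset(
    {"application/pdf", "image/jpeg", "image/png", "image/webp", "image/gif"}
)

MB = 1024 * 1024               # assumption: binary megabytes
WARN_BYTES = 20 * MB           # above this, show the "large file" banner
DOWNLOAD_ONLY_BYTES = 50 * MB  # above this, skip inline preview entirely


def choose_view_mode(content_type: str, file_size_bytes: Optional[int]) -> str:
    """Return "inline", "inline_with_warning", or "download_only"."""
    if content_type.strip().lower() not in INLINE_PREVIEWABLE:
        # TIFF, DICOM, and anything unknown fall into the "Other" branch.
        return "download_only"
    if file_size_bytes is None:
        # Size unknown: preview without a warning rather than block the user.
        return "inline"
    if file_size_bytes > DOWNLOAD_ONLY_BYTES:
        return "download_only"
    if file_size_bytes > WARN_BYTES:
        return "inline_with_warning"
    return "inline"
```

If the backend computes this, the size_warning flag in the response would be true exactly when the mode is "inline_with_warning".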

Implementation Checklist

Tier 1 — Opus

  • [ ] A2: get_download_url() service function — access control design, audit logging, presigned URL security review
  • [ ] B1: DocumentViewer.tsx component — modal architecture, zoom/pan behavior, mobile detection, error handling, PostHog events
  • [ ] Security review: Verify presigned URL cannot be used to access other tenants' documents. Verify CORS is correct. Verify Content-Type comes from metadata (not user input).

Tier 2 — Sonnet

  • [ ] A1: DocumentDownloadResponse schema — mechanical Pydantic model
  • [ ] A3: Download endpoint in router — follows existing endpoint patterns
  • [ ] A4: Event type comment update — one-line change
  • [ ] B2: DocsPanel "View" button — add icon + state + DocumentViewer render
  • [ ] B3: RequirementsChecklist view action — defer or implement based on scope
  • [ ] B4: getDownloadUrl() API function — mechanical API client call
  • [ ] C: R2 CORS verification — dashboard check + optional generate_download_url enhancement
  • [ ] Backend tests (test_document_download.py)
  • [ ] Frontend tests (DocumentViewer.test.tsx)
  • [ ] E2E test (document-viewer.spec.ts)

Security Considerations

Presigned URL Reuse

R2 presigned URLs are reusable within the expiry window. Anyone with the URL can download the file until it expires. This is acceptable for the same reason email attachments are acceptable -- the patient owns the document. The 15-minute window limits exposure.

CORS

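The R2 bucket must return Access-Control-Allow-Origin for the frontend domain (https://app.curaway.ai and http://localhost:5173 for dev) on any request the frontend makes from script with fetch() or XHR, such as the existing upload PUT or a blob-based download. Plain <iframe> and <img> loads are not subject to CORS, so a missing GET rule would not block the basic preview, but it would break any code path that reads the file from script. Verify and update the R2 CORS configuration as part of implementation.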

Content-Disposition

  • Inline preview: Content-Disposition: inline -- the browser renders the file.
  • Download button: Content-Disposition: attachment; filename="report.pdf" -- the browser downloads the file.

The presigned URL controls this via the ResponseContentDisposition parameter, which is fixed when the URL is signed. The default should be inline for the viewer. The download button should use a separate presigned URL signed with attachment disposition. The download attribute on an <a> tag is not a reliable substitute, because browsers ignore it for cross-origin URLs such as the R2 domain.
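As an illustration of the two dispositions, the backend might build the ResponseContentDisposition value with a helper like the one below. build_content_disposition() is a hypothetical name, not an existing function in r2_client; the sketch strips quotes and control characters from the plain filename and adds an RFC 5987 style filename* parameter so non-ASCII names survive.

```python
from urllib.parse import quote


def build_content_disposition(filename: str, as_attachment: bool = False) -> str:
    """Build a Content-Disposition value for a presigned GET URL."""
    disposition = "attachment" if as_attachment else "inline"

    # ASCII fallback: drop non-ASCII characters, then remove quotes,
    # backslashes, and control characters (including CR/LF) so the value
    # cannot break out of the quoted string or inject a header.
    ascii_name = filename.encode("ascii", "ignore").decode("ascii")
    ascii_name = "".join(
        ch for ch in ascii_name if ch.isprintable() and ch not in '"\\'
    )
    ascii_name = ascii_name or "document"

    # Percent-encoded UTF-8 form for clients that understand filename*.
    encoded = quote(filename, safe="")

    return (
        disposition
        + '; filename="' + ascii_name + '"'
        + "; filename*=UTF-8''" + encoded
    )
```

The result would be passed as content_disposition to generate_download_url(), once for the viewer (inline) and, if a download URL is needed, once more with as_attachment=True.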

Content-Type Integrity

The Content-Type served by R2 must match the actual file type. Never trust user-provided Content-Type for the download URL. Read the mime_type from the DocumentReference record (set at upload time after validation) and pass it as ResponseContentType in the presigned URL parameters. This prevents a malicious upload from being served as text/html (which could enable XSS via the R2 domain).
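A magic-byte comparison is one way the "set at upload time after validation" step could be checked. The sketch below is an assumption about how such a check might look, not the existing upload pipeline: sniff_mime() and declared_type_matches() are hypothetical helpers, and the signature table covers only a handful of common formats.

```python
from typing import Optional

# Leading byte signatures for a few formats relevant to this viewer.
_SIGNATURES = (
    (b"%PDF-", "application/pdf"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
)


def sniff_mime(head: bytes) -> Optional[str]:
    """Guess a MIME type from the first bytes of a file, or None if unknown."""
    for signature, mime in _SIGNATURES:
        if head.startswith(signature):
            return mime
    return None


def declared_type_matches(declared: str, head: bytes) -> bool:
    """True only when the sniffed type is known and equals the declared type."""
    detected = sniff_mime(head)
    return detected is not None and detected == declared.strip().lower()
```

Under this sketch an HTML payload declared as application/pdf would fail the check, and so would an HTML payload declared as text/html, because unknown signatures never match.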

Tenant Isolation

The get_document() function filters by tenant_id + patient_id + document_id + is_deleted == False. A patient in tenant-apollo-001 cannot generate a download URL for a document in tenant-other-002, even if they know the document_id. This is enforced at the database query level, not by URL signing.
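The filter described above can be illustrated with a self-contained SQLite example. This is not the project's SQLAlchemy query: find_document() and demo_connection() are hypothetical, and the column layout is assumed from the field names used elsewhere in this spec.

```python
import sqlite3
from typing import Optional


def find_document(
    conn: sqlite3.Connection, document_id: str, patient_id: str, tenant_id: str
) -> Optional[sqlite3.Row]:
    """Return the row only if all four conditions hold; otherwise None."""
    return conn.execute(
        "SELECT id, storage_key FROM document_references "
        "WHERE id = ? AND patient_id = ? AND tenant_id = ? AND is_deleted = 0",
        (document_id, patient_id, tenant_id),
    ).fetchone()


def demo_connection() -> sqlite3.Connection:
    """In-memory table with one live and one soft-deleted document."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE document_references ("
        "id TEXT PRIMARY KEY, patient_id TEXT, tenant_id TEXT, "
        "storage_key TEXT, is_deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO document_references VALUES (?, ?, ?, ?, ?)",
        [
            ("doc-001", "patient-001", "tenant-apollo-001", "k/doc-001.pdf", 0),
            ("doc-002", "patient-001", "tenant-apollo-001", "k/doc-002.pdf", 1),
        ],
    )
    return conn
```

Because a miss on any condition returns None, the service raises the same ValueError for a wrong tenant, a wrong patient, and a deleted document, and the caller sees an identical 404 in each case.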


Swagger / OpenAPI

The new endpoint will automatically appear in Swagger at /docs with the description and examples from the schema. Verify after implementation that the endpoint shows correctly with:

  • Path: GET /api/v1/patients/{patient_id}/documents/{document_id}/download
  • Tag: documents
  • Request: patient_id (path), document_id (path), X-Tenant-ID (header)
  • Response: APIResponse[DocumentDownloadResponse]
  • Error responses: 404 with STORAGE_DOCUMENT_NOT_FOUND_001


Post-Implementation Checklist

After implementation, verify:

  • [ ] Upload a PDF, then view it in the DocumentViewer. PDF renders in iframe.
  • [ ] Upload a JPEG, then view it. Image renders inline with zoom.
  • [ ] Check the events table for a document.viewed row after viewing.
  • [ ] Check the audit_logs table for a document.viewed row after viewing.
  • [ ] Try to view a document with a mismatched tenant_id. Expect 404.
  • [ ] Open the viewer on a mobile viewport (< 768px) with a PDF. Expect a new tab.
  • [ ] Verify dark mode renders correctly (no hardcoded white backgrounds other than the intentionally white PDF iframe container described in Edge Cases).
  • [ ] Verify the presigned URL expires after 15 minutes (wait and try again -- should get a new one on re-click).
  • [ ] Update CLAUDE.md with the new endpoint and event type.
  • [ ] Update docs/architecture/05-documents.md (or equivalent) with the download flow.
  • [ ] Update the health page if it tracks document endpoints.
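For the 15-minute expiry check in the list above, the expiry time can usually be read from the URL itself instead of waiting. The sketch assumes the URL is signed with SigV4 query parameters (X-Amz-Date and X-Amz-Expires); presigned_expiry() is a hypothetical helper, and it raises KeyError for the R2-not-configured placeholder URL, which carries no X-Amz-Date.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_expiry(url: str) -> datetime:
    """Return the UTC time at which a SigV4 presigned URL stops working."""
    query = parse_qs(urlsplit(url).query)
    # X-Amz-Date is the signing time in compact ISO 8601 form, always UTC.
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    # X-Amz-Expires is the lifetime in seconds, counted from X-Amz-Date.
    return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))
```

Comparing the result with the current UTC time during manual verification shows whether the URL was issued with the expected 900-second lifetime.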