Real-Time Messaging Roadmap

Current State (Phase 1: Upstash Redis)

Deployed: Upstash Redis REST API with list-based pub/sub (RPUSH/LPOP)

Pattern:

QStash OCR callback → RPUSH to Redis list → SSE endpoint LPOPs → EventSource in browser
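
The SSE half of this pattern can be sketched as a polling generator. This is a minimal illustration, not the deployed endpoint: `pop` is a hypothetical coroutine standing in for the Upstash LPOP call, and `format_sse`, `sse_events`, and `max_idle_polls` are names invented here.

```python
import asyncio
import json
from typing import AsyncIterator, Awaitable, Callable, Optional

# Hypothetical type: a coroutine that pops one raw JSON string from the
# Redis list, or returns None when the list is empty.
PopFn = Callable[[], Awaitable[Optional[str]]]


def format_sse(payload: dict, event: str = "message") -> str:
    """Serialize one payload as a Server-Sent Events frame."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


async def sse_events(
    pop: PopFn,
    poll_interval: float = 2.0,
    max_idle_polls: Optional[int] = None,
) -> AsyncIterator[str]:
    """Yield SSE frames by polling `pop`, sleeping only when the list is empty."""
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        raw = await pop()
        if raw is None:
            idle += 1
            await asyncio.sleep(poll_interval)  # this sleep is the ~2s latency
            continue
        idle = 0
        yield format_sse(json.loads(raw))
```

Because the loop sleeps only when the list is empty, a burst of queued messages drains immediately; the poll interval bounds the delay for a message that arrives while the loop is asleep.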

What's cached:

| Key Pattern | TTL | Data | Queries Saved |
|---|---|---|---|
| proc_reqs:{code} | 1 hour | Procedure requirements (static seed data) | 2-4 per chat message |
| conv_case:{case_id} | 10 min | Conversation ID for a case (immutable) | 1 per chat message |
| doc_checklist:{case_id} | 30 sec | Document checklist (changes on upload) | 3-4 per poll |
| health:data | 30 sec | Full health dashboard data | 11 queries + 3 HTTP calls |
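
The read path for these keys is presumably a cache-aside lookup: return the cached value if it has not expired, otherwise run the queries and store the result with the key's TTL. The sketch below uses an in-memory dictionary as a stand-in for Redis so it runs on its own; `TTLCache`, `get_or_load`, and `TTL_SECONDS` are hypothetical names, and only the TTL values come from the table.

```python
import time
from typing import Any, Callable, Dict, Tuple

# TTLs in seconds, keyed by the prefix of each cache key in the table.
TTL_SECONDS = {
    "proc_reqs": 3600,
    "conv_case": 600,
    "doc_checklist": 30,
    "health": 30,
}


class TTLCache:
    """In-memory stand-in for Redis GET / SET-with-expiry (hypothetical helper)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        prefix = key.split(":", 1)[0]
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database queries are skipped
        value = loader()  # cache miss: run the queries once
        self._store[key] = (now + TTL_SECONDS[prefix], value)
        return value
```

A call such as `cache.get_or_load("doc_checklist:42", load_checklist)` would then hit the database at most once every 30 seconds per case, where `load_checklist` is whatever function runs the underlying queries.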

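Limitations:
- SSE delivery latency: ~2 seconds (the LPOP poll interval)
- Not true push: Upstash REST doesn't support persistent subscriptions
- Free tier: 10K commands/day. At one LPOP every 2 seconds, a single always-open SSE connection issues ~43K commands/day, so the quota covers roughly 5 users connected for about an hour each per day
- No message persistence beyond the 10-minute TTL on lists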
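
A quick way to sanity-check the free-tier budget for other poll intervals or user counts. This is back-of-the-envelope arithmetic only: it assumes every poll costs exactly one command and ignores the RPUSH and cache commands that share the same quota. The function names are made up for this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(poll_interval_s: float) -> int:
    """Commands issued per day by one always-open SSE connection."""
    return int(SECONDS_PER_DAY / poll_interval_s)


def connected_minutes_per_user(
    daily_commands: int, users: int, poll_interval_s: float
) -> float:
    """Minutes of polling each user gets if the daily quota is split evenly."""
    polls_each = daily_commands / users
    return polls_each * poll_interval_s / 60


print(polls_per_day(2))                                    # 43200
print(round(connected_minutes_per_user(10_000, 5, 2), 1))  # 66.7
```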


Phase 2: Add Ably for Browser Push (Post-Seed, ~$0-29/month)

When to implement: When the 2-second SSE latency feels too slow for the demo/product, or when you need presence indicators ("Dr. Patel is reviewing your case").

What changes:
- Backend publishes to the Ably REST API instead of Redis lists (for browser-facing events)
- Frontend uses the Ably JS SDK (ably-js) over WebSocket, with <100ms delivery
- Keep Redis for server-side caching (proc_reqs, checklist, health)

Architecture:

QStash OCR callback → insert message in DB
                    → publish to Ably channel "case:{case_id}"
Frontend: Ably SDK subscribes to "case:{case_id}"
         → receives message via WebSocket in <100ms
         → appends to conversation

Implementation:

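# Backend: app/integrations/ably_client.py
import json
import os
from urllib.parse import quote

import httpx

# Ably API keys have the form "<key name>:<key secret>".
# Reading the key from this environment variable is an assumption.
ABLY_API_KEY = os.environ["ABLY_API_KEY"]

async def publish_to_case(case_id: str, message: dict) -> None:
    """Publish a message to the Ably channel for a case."""
    key_name, key_secret = ABLY_API_KEY.split(":", 1)  # basic auth: name / secret
    channel = quote(f"case:{case_id}", safe="")  # URL-encode the channel name
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"https://rest.ably.io/channels/{channel}/messages",
            json={"name": "new_message", "data": json.dumps(message)},
            auth=(key_name, key_secret),
        )
        response.raise_for_status()  # surface failed publishes instead of ignoring them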

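// Frontend: src/hooks/useRealtimeMessages.ts
import { useEffect, useState } from 'react';
import Ably from 'ably';

// ABLY_CLIENT_KEY is assumed to come from app config; prefer token auth in production.
export function useRealtimeMessages(caseId: string) {
  const [messages, setMessages] = useState<unknown[]>([]);

  useEffect(() => {
    const ably = new Ably.Realtime(ABLY_CLIENT_KEY);
    const channel = ably.channels.get(`case:${caseId}`);

    channel.subscribe('new_message', (msg) => {
      // The backend publishes the payload as a JSON string.
      const data = JSON.parse(msg.data);
      setMessages(prev => [...prev, data]);
    });

    // Stop listening and close the connection on unmount or when caseId changes.
    return () => { channel.unsubscribe(); ably.close(); };
  }, [caseId]);

  return messages;
}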

Cost:
- Free: 6M messages/month, 200 concurrent connections
- Pro ($29/mo): 50M messages, 10K connections
- At 1K cases/month with ~20 messages each: ~20K messages/month (well within free tier)

Pros:
- True real-time (<100ms vs 2s)
- WebSocket (more efficient than SSE polling)
- Presence API (show who's viewing the case)
- Message history (replay missed messages on reconnect)
- SDK handles reconnection, offline buffering

Cons:
- Third-party dependency for critical path
- Client-side API key management (token auth recommended)
- Another vendor to monitor


Phase 3: Kafka for Event Sourcing (Post Series-A)

When to implement: Only when ALL of these are true:

  1. You've split the monolith into microservices
  2. Multiple services need to consume the same events
  3. You need a durable audit trail of every state change
  4. You process >100K events/day

What it solves that Redis/Ably don't:
- Event sourcing: Complete replay of every case state change
- Multi-consumer: Analytics, compliance reporting, billing, notifications all reading the same event stream independently
- Ordering guarantees: Per-partition ordering for case events
- Retention: Keep events for days/weeks/months (configurable)
- Schema evolution: Avro/Protobuf schemas with compatibility checks
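
Per-partition ordering only helps if all events for one case land in the same partition, which in practice means using the case ID as the message key. The sketch below illustrates the idea with a CRC32 hash as a stand-in; Kafka's default partitioner uses a different hash (murmur2), and `NUM_PARTITIONS`, `partition_for`, and `route` are hypothetical names.

```python
import zlib
from collections import defaultdict
from typing import DefaultDict, Iterable, List, Tuple

NUM_PARTITIONS = 6  # hypothetical partition count for "case.events"

Event = Tuple[str, str]  # (case_id, event_type)


def partition_for(case_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable key -> partition mapping (CRC32 stand-in for Kafka's partitioner)."""
    return zlib.crc32(case_id.encode("utf-8")) % num_partitions


def route(events: Iterable[Event]) -> DefaultDict[int, List[Event]]:
    """Append each event to the partition chosen by its case ID."""
    partitions: DefaultDict[int, List[Event]] = defaultdict(list)
    for case_id, event_type in events:
        partitions[partition_for(case_id)].append((case_id, event_type))
    return partitions
```

Since the mapping depends only on the key, every event for a given case is appended to one partition in publish order, even when events for other cases are interleaved.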

Architecture:

Any service → Kafka topic "case.events"
                ↓ Consumer Group: "analytics"
                → Write to analytics warehouse
                ↓ Consumer Group: "notifications"
                → Send email/SMS notifications
                ↓ Consumer Group: "compliance"
                → Write to audit log
                ↓ Consumer Group: "realtime"
                → Publish to Ably for browser push
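
The fan-out in this diagram rests on each consumer group tracking its own offset into the same log. A toy in-memory model of that behavior is sketched below; it is not Kafka client code, and `TopicLog` and its methods are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TopicLog:
    """Append-only log with one read offset per consumer group."""

    events: List[dict] = field(default_factory=list)
    offsets: Dict[str, int] = field(default_factory=dict)

    def publish(self, event: dict) -> None:
        self.events.append(event)

    def poll(self, group: str, max_events: int = 100) -> List[dict]:
        """Return the next unread events for `group` and advance its offset."""
        start = self.offsets.get(group, 0)
        batch = self.events[start:start + max_events]
        self.offsets[group] = start + len(batch)
        return batch
```

Polling as "analytics" does not consume events for "notifications": each group sees every event exactly once in this model, and a new group added later starts from offset 0 and replays the whole log.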

Topics:

| Topic | Events | Consumers |
|---|---|---|
| case.events | case_created, status_changed, provider_selected, consent_given, forwarded | Analytics, Compliance, Notifications |
| document.events | uploaded, ocr_started, ocr_completed, analysis_completed, validated | EHR Builder, Document Service, Analytics |
| match.events | match_requested, match_completed, provider_selected | Analytics, Provider Dashboard |
| patient.events | registered, profile_updated, consent_granted | CRM, Notifications |

Provider: Confluent Cloud (managed Kafka)
- Basic: $0.11/GB, ~$50-100/mo minimum
- Standard: dedicated cluster, ~$300/mo
- Enterprise: HIPAA BAA available

Pros:
- Industry standard for event-driven architecture
- Unlimited replay (within retention period)
- Decouples services completely
- Excellent tooling (Schema Registry, ksqlDB, Connect)

Cons:
- Minimum $50-100/mo even for low volume
- Needs persistent consumer processes (not serverless-friendly)
- Operational complexity (partitions, consumer groups, offsets, rebalancing)
- Schema management overhead
- Overkill for <10K events/day


Decision Matrix

| Scale | Recommendation | Monthly Cost | Latency | Complexity |
|---|---|---|---|---|
| MVP / Demo | Upstash Redis (current) | $0 | ~2s | Low |
| Seed / Early Users (100-1K cases/mo) | Redis + Ably | $0-29 | <100ms | Low |
| Growth (1K-10K cases/mo) | Redis + Ably + consider Kafka | $29-150 | <100ms | Medium |
| Scale (10K+ cases/mo, microservices) | Redis + Ably + Kafka | $200-500 | <100ms | High |

Migration Path

Each phase is additive — you don't rip out the previous layer:

  1. Redis stays as the caching layer regardless of messaging choice
  2. Ably replaces the SSE endpoint for browser push (Redis lists become unnecessary for messaging, but keep for caching)
  3. Kafka adds event sourcing underneath — services publish to Kafka, a consumer publishes to Ably for browser delivery
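
One way to keep these steps additive in code is to hide the transport behind a small interface, so the OCR callback does not change when Redis lists give way to Ably, or when a Kafka consumer later becomes the caller. The sketch below is illustrative only: `BrowserPublisher`, `InMemoryPublisher`, and `on_ocr_completed` are hypothetical names, and the in-memory class stands in for a real Redis-list or Ably-backed implementation.

```python
import asyncio
from typing import Dict, List, Protocol


class BrowserPublisher(Protocol):
    """Anything that can deliver a case message toward the browser."""

    async def publish(self, case_id: str, message: dict) -> None: ...


class InMemoryPublisher:
    """Test double standing in for a Redis-list or Ably-backed publisher."""

    def __init__(self) -> None:
        self.sent: Dict[str, List[dict]] = {}

    async def publish(self, case_id: str, message: dict) -> None:
        self.sent.setdefault(case_id, []).append(message)


async def on_ocr_completed(
    publisher: BrowserPublisher, case_id: str, result: dict
) -> None:
    """Callback logic stays the same whichever transport is plugged in."""
    await publisher.publish(case_id, {"type": "ocr_completed", **result})
```

Under this arrangement, moving from Phase 1 to Phase 2 would mean swapping the object passed as `publisher`, and in Phase 3 the "realtime" consumer would call the same `publish` method for each event it reads.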

The current Redis caching (proc_reqs, conversation, checklist, health) remains valuable at all scales.