Spec Audit Gap Report

Date: 2026-04-06
Audited by: Claude Code (Session 32)
Specs covered: 8 AI Steer docs + 15 SDD-MVP docs (23 total)
Overall compliance: ~60% implemented, ~20% partial, ~12% diverged, ~8% missing


Priority 0 — Demo Blockers

1. No RTL rendering for Arabic (Steer-07)

  • What: Frontend has no dir="rtl", no locale utility, no Arabic template fallback
  • Where: curaway-health-navigator/src/pages/ConversationApp.tsx, missing src/utils/locale.ts
  • Impact: Aisha demo scenario is Arabic — investors see broken text
  • Effort: 1 session
  • Fix: Add isRTL() utility, dir="rtl" on Arabic messages, Arabic explanation templates

2. No extraction indicators in chat (Steer-06)

  • What: Patient has zero visibility that their medical data was captured from conversation
  • Where: Missing ExtractionIndicator.tsx component; backend doesn't include extracted entities in response metadata
  • Impact: Silent extraction feels broken — patients don't know system "heard" their info
  • Effort: 1 session
  • Fix: Include extracted entities in ChatResponse.metadata, render as teal badges below agent response

Priority 1 — High Impact

3. Security headers missing (SDD-12.9)

  • What: No CSP, HSTS, X-Frame-Options, X-Content-Type-Options
  • Where: app/main.py — no security headers middleware
  • Impact: Standard web security missing; XSS/clickjacking risk
  • Effort: 2 hours
  • Fix: Add a security-headers middleware (for example a small custom ASGI/Starlette middleware) or inject the headers manually on each response
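
A minimal sketch of the header-injection approach, written as a plain ASGI middleware so it needs no extra dependency. The header values are illustrative defaults rather than the project's actual policy, and the CSP in particular would need tuning for the real frontend.

```python
# Illustrative values only; the real policy is an assumption to be reviewed.
SECURITY_HEADERS = [
    (b"content-security-policy", b"default-src 'self'"),
    (b"strict-transport-security", b"max-age=31536000; includeSubDomains"),
    (b"x-frame-options", b"DENY"),
    (b"x-content-type-options", b"nosniff"),
]


class SecurityHeadersMiddleware:
    """Pure ASGI middleware that appends security headers to HTTP responses."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        # Only HTTP responses carry headers; pass other scopes through untouched.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                existing = list(message.get("headers", []))
                present = {name.lower() for name, _ in existing}
                # Do not override a header that an endpoint set explicitly.
                extra = [(k, v) for k, v in SECURITY_HEADERS if k not in present]
                message = {**message, "headers": existing + extra}
            await send(message)

        await self.app(scope, receive, send_with_headers)
```

Assuming app/main.py builds a Starlette or FastAPI application, a class of this shape can be registered with `app.add_middleware(SecurityHeadersMiddleware)`.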

4. GDPR deletion incomplete (SDD-09.10-13)

  • What: Data subject deletion doesn't cascade to Neo4j, Qdrant, R2 files, or Redis cache
  • Where: app/services/data_subject_handler.py
  • Impact: GDPR Article 17 non-compliance for binary files and graph data
  • Effort: 2 days
  • Fix: Add Neo4j patient node deletion, Qdrant vector deletion, R2 file deletion via QStash, Redis key flush
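
One way the cascade could be orchestrated is sketched below. The per-store delete callables are hypothetical stand-ins for the real Neo4j, Qdrant, R2, and Redis clients; the point is only that each store is attempted independently and the outcome is recorded, so a partial deletion is visible and can be retried.

```python
from typing import Callable, Dict


def cascade_delete(patient_id: str,
                   deleters: Dict[str, Callable[[str], None]]) -> Dict[str, str]:
    """Run every store's delete step and report the outcome per store.

    One failing store does not stop the others, so the caller can see
    exactly which stores still hold data and retry only those.
    """
    report = {}
    for store, delete in deleters.items():
        try:
            delete(patient_id)
            report[store] = "deleted"
        except Exception as exc:  # keep going; record what failed
            report[store] = f"failed: {exc}"
    return report
```

A request would only count as complete once every entry in the returned report reads "deleted"; failed entries could be re-queued (for example via QStash, as the fix suggests for R2 files).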

5. PostgreSQL RLS not enforced (SDD-01.26)

  • What: Tenant isolation is app-layer WHERE clauses only, no DB-level Row Level Security policies
  • Where: Alembic migrations — no RLS policies exist
  • Impact: A bug in app-layer filtering could leak cross-tenant data
  • Effort: 2-3 days
  • Fix: Add RLS policies per table that filter on current_setting('app.tenant_id'), and set app.tenant_id on each connection or transaction (e.g. SET LOCAL or set_config())
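
A sketch of the SQL such a migration might issue, generated from Python so it could be looped over tables inside an Alembic revision. The `tenant_id` column name and the `tenant_isolation` policy name are assumptions, not taken from the existing schema.

```python
import re
from typing import List

_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")

# Run once per request/transaction before any query. The third argument
# (true) makes the setting transaction-local, like SET LOCAL.
SET_TENANT_SQL = "SELECT set_config('app.tenant_id', :tenant_id, true)"


def rls_statements(table: str, tenant_column: str = "tenant_id") -> List[str]:
    """Return the DDL that enables tenant isolation on one table."""
    for name in (table, tenant_column):
        # Identifiers are interpolated into DDL, so reject anything unusual.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        # FORCE applies the policy to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY",
        # missing_ok=true yields NULL when the setting is absent, so an
        # unset tenant matches no rows instead of raising an error.
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({tenant_column}::text = current_setting('app.tenant_id', true))",
    ]
```

With this shape, a request that forgets to set the tenant sees no rows at all, which fails closed rather than leaking data.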

6. Consent checks not enforced before PHI access

  • What: consent_service.check_consent() exists but is not called systematically before PHI access
  • Where: No middleware or decorator for consent gating
  • Impact: PHI could be accessed without active consent
  • Effort: 1-2 days
  • Fix: Add consent-check decorator or middleware on PHI-accessing endpoints
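
The decorator variant might look roughly like the sketch below. The `check_consent(patient_id, purpose)` callable stands in for consent_service.check_consent(), whose real signature may differ, and `ConsentRequiredError` is a hypothetical exception that the API layer would map to a 403 response.

```python
import functools
from typing import Awaitable, Callable


class ConsentRequiredError(Exception):
    """Raised when a patient has no active consent for the requested purpose."""


def require_consent(check_consent: Callable[[str, str], Awaitable[bool]],
                    purpose: str):
    """Decorator factory: gate an async handler on an active consent record."""

    def decorator(handler):
        @functools.wraps(handler)
        async def wrapper(patient_id: str, *args, **kwargs):
            # Check consent before the handler touches any PHI.
            if not await check_consent(patient_id, purpose):
                raise ConsentRequiredError(f"no active consent for {purpose!r}")
            return await handler(patient_id, *args, **kwargs)

        return wrapper

    return decorator
```

Applying the gate as a decorator keeps the check visible at each PHI-accessing endpoint; a middleware would enforce it more uniformly but needs a reliable way to tell which routes touch PHI.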

7. mypy not in CI pipeline (SDD-13.8)

  • What: Type checking not run pre-merge
  • Where: .github/workflows/ci.yml — mypy step missing
  • Impact: Type errors not caught before merge
  • Effort: 1 hour
  • Fix: Add python -m mypy app/ --ignore-missing-imports to CI

8. Event emissions from endpoints missing (Steer-02.4)

  • What: No patient_registered, consent_granted, chat_message_sent, match_completed events
  • Where: app/routers/patients.py, consent.py, cases.py, match.py
  • Impact: Events table incomplete — analytics, audit trail, and SSE all lack foundational events
  • Effort: 2 days
  • Fix: Add db.add(Event(...)) at each endpoint's success path

9. EHR Builder not a service class (Steer-01, SDD-05)

  • What: No merge rules, no conflict detection, no source priority. Function-based, not class-based.
  • Where: app/agents/ehr_builder_agent.py (function), not app/services/ehr_builder_service.py (class)
  • Impact: Multi-source data can overwrite silently; contradictory clinical data undetected
  • Effort: 3-5 days
  • Fix: Refactor into EHRBuilderService class with process_event(), merge_clinical_entities(), detect_conflicts()

10. No match progress SSE (Steer-05, SDD-14)

  • What: Matching is a black box — no stage-by-stage progress events
  • Where: app/services/match_service.py — no match_progress events emitted
  • Impact: User sees spinner during 8-10s matching with no feedback
  • Effort: 1-2 days
  • Fix: Emit progress events at each matching stage (Qdrant, Neo4j, scoring, reranking)
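
As a rough illustration, the stage-by-stage events could be serialised as Server-Sent Events frames like this. The stage names come from the fix above; the payload fields (`match_id`, `step`, `total`) are assumptions, and the real service would presumably publish through its existing Redis SSE channel rather than yield strings directly.

```python
import json
from typing import Iterator

MATCH_STAGES = ("qdrant", "neo4j", "scoring", "reranking")


def format_sse(event: str, data: dict) -> str:
    """Serialise one SSE frame: an event name, a JSON data line, a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def match_progress_frames(match_id: str) -> Iterator[str]:
    """Yield one match_progress frame per matching stage, in order."""
    total = len(MATCH_STAGES)
    for step, stage in enumerate(MATCH_STAGES, start=1):
        # In the real service, the stage's work would run here before the yield.
        yield format_sse(
            "match_progress",
            {"match_id": match_id, "stage": stage, "step": step, "total": total},
        )
```

The frontend can then replace the spinner with a "step N of 4" indicator driven by the `step` and `total` fields.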

Priority 2 — Architectural Debt

11. Neo4j patient nodes not auto-synced from FHIR (Steer-04, SDD-05)

  • What: Patient graph nodes not updated when FHIR resources created
  • Where: app/services/graph_service.py — no sync_patient_conditions() call after FHIR write
  • Impact: Graph-enhanced matching can't use patient-specific clinical data
  • Effort: 2-3 days

12. QStash webhook signature verification missing (Steer-02.3)

  • What: Webhook endpoints accept unverified requests
  • Where: app/routers/internal.py — no QStash signature check
  • Impact: Security gap — anyone could call internal endpoints
  • Effort: 1-2 days

13. Matching weights hardcoded (SDD-06)

  • What: GraphEnhancedWeightedV1.WEIGHTS are class constants, not loaded from Flagsmith
  • Where: app/services/matching_engine.py
  • Impact: Can't A/B test different weight profiles without deploy
  • Effort: 1 day

14. No unified demo seed script (Steer-08)

  • What: Requires 4+ separate scripts to set up demo
  • Where: app/seed_demo.py, app/seed_providers.py, app/seed_procedure_tests.py, etc.
  • Impact: Demo setup is manual and error-prone
  • Effort: 1 day

15. No BDD test layer (SDD-13.4)

  • What: No behave configuration, no .feature files
  • Impact: Missing behavior-driven tests for happy paths + GDPR flows
  • Effort: 2-3 days

16. No Event Type Registry (Steer-02.1, SDD-02)

  • What: Events are free-form strings, no schema validation, no EventType enum
  • Where: Events written ad-hoc across codebase
  • Impact: Fragile to event schema drift
  • Effort: 2-3 days
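
A registry could start as small as the sketch below. The event names are the ones listed under gap 8; the required payload fields are purely illustrative. The spec calls for Pydantic models, so this standard-library version only shows the shape of the validation, not the intended implementation.

```python
from enum import Enum


class EventType(str, Enum):
    PATIENT_REGISTERED = "patient_registered"
    CONSENT_GRANTED = "consent_granted"
    CHAT_MESSAGE_SENT = "chat_message_sent"
    MATCH_COMPLETED = "match_completed"


# Required payload keys per event type (hypothetical field names).
REQUIRED_FIELDS = {
    EventType.PATIENT_REGISTERED: {"patient_id"},
    EventType.CONSENT_GRANTED: {"patient_id", "purpose"},
    EventType.CHAT_MESSAGE_SENT: {"patient_id", "case_id"},
    EventType.MATCH_COMPLETED: {"patient_id", "match_id"},
}


def validate_event(event_type: str, payload: dict) -> EventType:
    """Reject unregistered event types and payloads missing required keys."""
    try:
        kind = EventType(event_type)
    except ValueError:
        raise ValueError(f"unregistered event type: {event_type!r}") from None
    missing = REQUIRED_FIELDS[kind] - set(payload)
    if missing:
        raise ValueError(f"{kind.value} missing fields: {sorted(missing)}")
    return kind
```

Calling `validate_event()` at the single place where events are written would turn a typo such as "consent_grnted" into an immediate error instead of a silently orphaned row.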

17. No FHIR confidence field (SDD-03.4)

  • What: No confidence FLOAT on fhir_resources table
  • Impact: Can't implement spec'd merge rules (confidence delta > 0.3 flagging)
  • Effort: 0.5 days

18. Shadow mode not executing (SDD-06.14)

  • What: is_shadow field exists on MatchResult but no code runs dual strategies
  • Impact: Can't safely evaluate new matching strategies
  • Effort: 1-2 days
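
A minimal sketch of dual-strategy execution follows. The `MatchResult` dataclass here is a simplified stand-in for the real model (only `is_shadow` is taken from the report), and strategies are modelled as plain callables for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class MatchResult:  # simplified stand-in for the real model
    provider_ids: List[str]
    strategy: str
    is_shadow: bool = False


Strategy = Tuple[str, Callable[[dict], List[str]]]  # (name, matcher)


def run_with_shadow(case: dict, primary: Strategy,
                    shadow: Optional[Strategy] = None,
                    record: Callable[[MatchResult], None] = lambda r: None
                    ) -> MatchResult:
    """Serve the primary strategy's result; record the shadow's for comparison."""
    name, matcher = primary
    result = MatchResult(matcher(case), name)
    if shadow is not None:
        shadow_name, shadow_matcher = shadow
        try:
            record(MatchResult(shadow_matcher(case), shadow_name, is_shadow=True))
        except Exception:
            # A failing shadow strategy must never affect the user-facing result.
            pass
    return result
```

In practice the shadow run would more likely be dispatched in the background so it adds no latency to the user-facing match; it is run inline here only to keep the example short.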

19. Clinical Context Agent has no QStash retry on failure (SDD-04.6)

  • What: Failed extractions logged but not queued for retry
  • Impact: Transient LLM failures permanently lose extraction
  • Effort: 1 day

20. Rate limiting only on public routes (SDD-12.11)

  • What: Authenticated endpoints unprotected from abuse
  • Where: app/middleware/rate_limiter.py — scoped to /api/v1/public/ only
  • Impact: No DoS protection on authenticated endpoints
  • Effort: 0.5 days

Priority 3 — Nice-to-Have

| #  | Gap                                               | Source   | Effort   |
|----|---------------------------------------------------|----------|----------|
| 21 | agent_name missing from message model             | SDD-08   | 1-2h     |
| 22 | No GET /patients/{id}/matches endpoint            | SDD-10   | 4h       |
| 23 | No WCAG formal audit                              | SDD-11   | 1 day    |
| 24 | No snapshot tests (API schema contracts)          | SDD-13   | 4h       |
| 25 | Intake progress not weighted per spec categories  | Steer-01 | 2 days   |
| 26 | No useEventStream reusable hook (frontend)        | SDD-11   | 2h       |
| 27 | Prettier not in frontend devDependencies          | SDD-13   | 1h       |
| 28 | Consent expiry warning logic deferred             | SDD-09   | 4h       |
| 29 | No E2E demo journey Playwright test               | Steer-08 | 1-2 days |
| 30 | Unified patient events SSE endpoint               | Steer-02 | 2-3 days |

Architectural Divergences (Working Differently Than Spec)

These are not bugs — the system works — but the implementation shape differs from the spec's vision.

| Spec Vision                                          | Actual Implementation            | Risk Level         |
|------------------------------------------------------|----------------------------------|--------------------|
| 4-stage matching pipeline (Qdrant→Neo4j→PG→LLM)      | Flat single service call         | Low for demo       |
| Event-driven EHR Builder class with process_event()  | Imperative function calls        | Low for demo       |
| Typed event registry with Pydantic models            | Ad-hoc string event types        | Medium: drift risk |
| Unified patient SSE endpoint                         | 3 purpose-specific SSE endpoints | Low: works fine    |
| Clerk JWT org-based RBAC                             | Header-based tenant ID           | Medium: spoofable  |
| LangGraph intake intent classification               | Case orchestrator phase routing  | Low: equivalent    |

What's Solid (No Gaps)

  • Technology stack (25 services, 21 free tiers)
  • Modular monolith with 8 isolated domains
  • 4-agent LangGraph pipeline with deterministic fallbacks
  • Triple-tier OCR (PyMuPDF → Unstructured → Claude Vision)
  • FHIR R4 validation and storage
  • Neo4j knowledge graph (42 providers, OFFERS/REQUIRES_TEST)
  • Qdrant semantic search (42 provider embeddings)
  • 43+ feature flags with YAML fallbacks
  • Guardrails (3 layers: classifier, output validator, file validator)
  • Correlation IDs + idempotency middleware
  • MCP server (6 tools)
  • Langfuse tracing on all agent calls
  • 760 backend tests, 0 failures
  • SSE document upload progress (6 steps)
  • LLM response streaming with Redis SSE
  • Chat pipeline caching + parallel + deferred extraction
  • Inline OCR fast path
  • 143-parameter document matching
  • Configurable gating thresholds

For investor demo (1-2 sessions):

  1. Security headers (2h)
  2. RTL rendering + Arabic templates (1 session)
  3. Extraction indicators in chat (0.5 session)
  4. Unified demo seed script (0.5 day)

For production readiness (3-5 sessions):

  5. GDPR deletion cascade (Neo4j, Qdrant, R2, Redis)
  6. Consent middleware enforcement
  7. PostgreSQL RLS policies
  8. mypy in CI
  9. QStash webhook signature verification
  10. Event emissions from all endpoints

For architectural alignment (post-seed):

  11. EHR Builder service class with merge rules
  12. 4-stage matching pipeline abstraction
  13. Typed event registry
  14. Neo4j patient node auto-sync
  15. Shadow mode for matching strategies