Spec Audit Gap Report

Date: 2026-04-06
Audited by: Claude Code (Session 32)
Specs covered: 8 AI Steer docs + 15 SDD-MVP docs (23 total)
Overall compliance: ~60% implemented, ~20% partial, ~12% diverged, ~8% missing

Priority 0 — Demo Blockers

1. No RTL rendering for Arabic (Steer-07)

What: Frontend has no dir="rtl", no locale utility, no Arabic template fallback
Where: curaway-health-navigator/src/pages/ConversationApp.tsx, missing src/utils/locale.ts
Impact: Aisha demo scenario is Arabic — investors see broken text
Effort: 1 session
Fix: Add isRTL() utility, dir="rtl" on Arabic messages, Arabic explanation templates

2. No extraction indicators in chat (Steer-06)

What: Patient has zero visibility that their medical data was captured from conversation
Where: Missing ExtractionIndicator.tsx component; backend doesn't include extracted entities in response metadata
Impact: Silent extraction feels broken — patients don't know system "heard" their info
Effort: 1 session
Fix: Include extracted entities in ChatResponse.metadata, render as teal badges below agent response

Priority 1 — High Impact

3. Security headers missing (SDD-12.9)

What: No CSP, HSTS, X-Frame-Options, X-Content-Type-Options
Where: app/main.py — no security headers middleware
Impact: Standard web security missing; XSS/clickjacking risk
Effort: 2 hours
Fix: Add a security headers middleware (Starlette does not ship one, so use a custom or third-party SecurityHeadersMiddleware) or inject the headers manually

4. GDPR deletion cascade incomplete

What: Data subject deletion doesn't cascade to Neo4j, Qdrant, R2 files, or Redis cache
Where: app/services/data_subject_handler.py
Impact: GDPR Article 17 non-compliance for binary files and graph data
Effort: 2 days
Fix: Add Neo4j patient node deletion, Qdrant vector deletion, R2 file deletion via QStash, Redis key flush
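
One possible shape for the cascade, sketched with injected per-store deleters so that a failure in one store neither skips the others nor goes unrecorded. `cascade_delete` and `DeletionReport` are hypothetical names; the real deleters would wrap the Neo4j, Qdrant, R2 (via QStash), and Redis clients.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DeletionReport:
    """Per-store outcome of one erasure request, suitable for an audit log."""
    patient_id: str
    succeeded: List[str] = field(default_factory=list)
    failed: Dict[str, str] = field(default_factory=dict)

    @property
    def complete(self) -> bool:
        return not self.failed


def cascade_delete(
    patient_id: str,
    deleters: Dict[str, Callable[[str], None]],
) -> DeletionReport:
    """Run every store's deleter; one failing store must not skip the rest."""
    report = DeletionReport(patient_id)
    for store, delete in deleters.items():
        try:
            delete(patient_id)
            report.succeeded.append(store)
        except Exception as exc:
            # Record and continue so the remaining gap is visible and retryable.
            report.failed[store] = repr(exc)
    return report
```

A report with `complete == False` would be the natural trigger for a retry job, since an erasure request is only satisfied once every store has confirmed deletion.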

5. PostgreSQL RLS not enforced (SDD-01.26)

What: Tenant isolation is app-layer WHERE clauses only, no DB-level Row Level Security policies
Where: Alembic migrations — no RLS policies exist
Impact: A bug in app-layer filtering could leak cross-tenant data
Effort: 2-3 days
Fix: Add RLS policies per table that read current_setting('app.tenant_id'), and set app.tenant_id on each connection or transaction
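
A sketch of a helper that generates the per-table statements for an Alembic migration. It assumes each tenant-scoped table has a `tenant_id` column; the helper name `rls_statements`, the policy name `tenant_isolation`, and the table name in the usage note are hypothetical.

```python
import re
from typing import List

# Only plain lowercase identifiers are accepted, since the names are
# interpolated into DDL and cannot be passed as bind parameters.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def rls_statements(table: str, tenant_column: str = "tenant_id") -> List[str]:
    """Return the SQL needed to enforce tenant isolation on one table."""
    for name in (table, tenant_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY",
        # FORCE applies the policy to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY",
        # With missing_ok=true an unset tenant yields NULL, so no rows match.
        f"CREATE POLICY tenant_isolation ON {table} "
        f"USING ({tenant_column}::text = current_setting('app.tenant_id', true))",
    ]
```

In a migration each statement would go through `op.execute(...)`, for example over `rls_statements("patients")`. At request time the tenant can be set per transaction with `SELECT set_config('app.tenant_id', :tenant_id, true)`, which avoids leaking the value across pooled connections. Superusers and roles with BYPASSRLS are not restricted by these policies, so the application role should be neither.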

6. Consent check not enforced before PHI access

What: consent_service.check_consent() exists but is not called systematically before PHI access
Where: No middleware or decorator for consent gating
Impact: PHI could be accessed without active consent
Effort: 1-2 days
Fix: Add consent-check decorator or middleware on PHI-accessing endpoints
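
A sketch of the decorator option. The report does not show the signature of `consent_service.check_consent()`, so the `(patient_id, purpose) -> bool` callable used here is an assumption, as are the names `requires_consent` and `ConsentRequiredError`. The wrapper is synchronous for brevity; async endpoints would need an `async def` wrapper.

```python
import functools
from typing import Callable


class ConsentRequiredError(Exception):
    """Raised when a patient has no active consent for the requested purpose."""


def requires_consent(check_consent: Callable[[str, str], bool], purpose: str):
    """Decorator factory: block the wrapped call unless consent is active."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(patient_id: str, *args, **kwargs):
            # Gate first, so the handler body never touches PHI without consent.
            if not check_consent(patient_id, purpose):
                raise ConsentRequiredError(
                    f"no active {purpose!r} consent for patient {patient_id}"
                )
            return func(patient_id, *args, **kwargs)

        return wrapper

    return decorator
```

Applying the gate declaratively makes a missing check visible in review: any PHI-accessing handler without the decorator stands out.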

7. mypy not in CI pipeline (SDD-13.8)

What: Type checking not run pre-merge
Where: .github/workflows/ci.yml — mypy step missing
Impact: Type errors not caught before merge
Effort: 1 hour
Fix: Add python -m mypy app/ --ignore-missing-imports to CI

8. Event emissions from endpoints missing (Steer-02.4)

What: No patient_registered, consent_granted, chat_message_sent, match_completed events
Where: app/routers/patients.py, consent.py, cases.py, match.py
Impact: Events table incomplete — analytics, audit trail, and SSE all lack foundational events
Effort: 2 days
Fix: Add db.add(Event(...)) at each endpoint's success path
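
A sketch of what the success-path emission could look like. The `Event` fields, the `emit_event` helper, `FakeSession`, and `register_patient` are all hypothetical stand-ins; the real `Event` model and session come from the codebase.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class Event:
    """Stand-in for the ORM Event model; the field names are assumptions."""
    event_type: str
    tenant_id: str
    payload: Dict[str, Any]
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class FakeSession:
    """Minimal stand-in for a DB session: only records added objects."""

    def __init__(self) -> None:
        self.pending: List[Any] = []

    def add(self, obj: Any) -> None:
        self.pending.append(obj)


def emit_event(db, event_type: str, tenant_id: str, **payload: Any) -> Event:
    """Add an event to the caller's session so it commits with the main write."""
    event = Event(event_type, tenant_id, payload)
    db.add(event)
    return event


def register_patient(db, tenant_id: str, name: str) -> str:
    """Example endpoint body: domain write first, then the event."""
    patient_id = f"patient-{len(db.pending) + 1}"
    db.add({"id": patient_id, "name": name})
    emit_event(db, "patient_registered", tenant_id, patient_id=patient_id)
    return patient_id
```

Adding the event to the same session as the domain write means both commit or roll back together, so the events table cannot record a registration that never happened.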

9. EHR Builder not a service class (Steer-01, SDD-05)

What: No merge rules, no conflict detection, no source priority. Function-based, not class-based.
Where: app/agents/ehr_builder_agent.py (function), not app/services/ehr_builder_service.py (class)
Impact: Multi-source data can overwrite silently; contradictory clinical data undetected
Effort: 3-5 days
Fix: Refactor into EHRBuilderService class with process_event(), merge_clinical_entities(), detect_conflicts()
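
A rough sketch of the class shape. The three method names come from the fix above; everything else is an assumption, including the method signatures, the `ClinicalEntity` type, and the source ordering in `SOURCE_PRIORITY`. The spec's actual merge rules are not reproduced in this report.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical ordering; the real source priority is defined in the spec.
SOURCE_PRIORITY: Dict[str, int] = {"document": 2, "conversation": 1}


@dataclass(frozen=True)
class ClinicalEntity:
    key: str      # e.g. "blood_type"
    value: str
    source: str   # where the value was extracted from


class EHRBuilderService:
    def __init__(self, source_priority: Optional[Dict[str, int]] = None):
        self.source_priority = dict(source_priority or SOURCE_PRIORITY)

    def _rank(self, entity: ClinicalEntity) -> int:
        return self.source_priority.get(entity.source, 0)

    def detect_conflicts(
        self, existing: List[ClinicalEntity], incoming: List[ClinicalEntity]
    ) -> List[Tuple[ClinicalEntity, ClinicalEntity]]:
        """Pair up entities that share a key but disagree on the value."""
        by_key = {e.key: e for e in existing}
        return [
            (by_key[new.key], new)
            for new in incoming
            if new.key in by_key and by_key[new.key].value != new.value
        ]

    def merge_clinical_entities(
        self, existing: List[ClinicalEntity], incoming: List[ClinicalEntity]
    ) -> List[ClinicalEntity]:
        """Keep one entity per key; a lower-priority source never overwrites."""
        merged = {e.key: e for e in existing}
        for new in incoming:
            old = merged.get(new.key)
            if old is None or self._rank(new) >= self._rank(old):
                merged[new.key] = new
        return list(merged.values())

    def process_event(self, existing, incoming):
        """Return (merged record, conflicts) so callers can flag for review."""
        conflicts = self.detect_conflicts(existing, incoming)
        return self.merge_clinical_entities(existing, incoming), conflicts
```

Returning the conflicts alongside the merged record addresses the impact noted above: contradictory values are surfaced instead of being overwritten silently.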

10. No match progress SSE (Steer-05, SDD-14)

What: Matching is a black box — no stage-by-stage progress events
Where: app/services/match_service.py — no match_progress events emitted
Impact: User sees spinner during 8-10s matching with no feedback
Effort: 1-2 days
Fix: Emit progress events at each matching stage (Qdrant, Neo4j, scoring, reranking)
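
A sketch of stage-by-stage emission as a generator of SSE frames. The stage names follow the fix above and `match_progress` is the event name from this item; the payload fields and the helper names `format_sse` and `run_match_with_progress` are assumptions.

```python
import json
from typing import Callable, Dict, Iterator, List, Tuple


def format_sse(event: str, data: Dict) -> str:
    """Serialize one Server-Sent Events frame (terminated by a blank line)."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def run_match_with_progress(
    stages: List[Tuple[str, Callable[[], None]]],
) -> Iterator[str]:
    """Run each matching stage in order, yielding a frame before and after."""
    total = len(stages)
    for step, (name, run_stage) in enumerate(stages, start=1):
        info = {"stage": name, "step": step, "total": total}
        yield format_sse("match_progress", {**info, "status": "started"})
        run_stage()
        yield format_sse("match_progress", {**info, "status": "completed"})
    yield format_sse("match_completed", {"total": total})
```

The generator could be passed to a streaming response with media type `text/event-stream`, with the four stages supplied as `("qdrant", ...)`, `("neo4j", ...)`, `("scoring", ...)`, and `("reranking", ...)`.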

Priority 2 — Architectural Debt

11. Neo4j patient nodes not auto-synced from FHIR (Steer-04, SDD-05)

What: Patient graph nodes not updated when FHIR resources created
Where: app/services/graph_service.py — no sync_patient_conditions() call after FHIR write
Impact: Graph-enhanced matching can't use patient-specific clinical data
Effort: 2-3 days

12. QStash webhook signature verification missing (Steer-02.3)

What: Webhook endpoints accept unverified requests
Where: app/routers/internal.py — no QStash signature check
Impact: Security gap — anyone could call internal endpoints
Effort: 1-2 days
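
A stdlib-only sketch of the check, to illustrate the moving parts. It assumes the signature arrives in the `Upstash-Signature` header as an HS256 JWT whose `iss` claim is `Upstash` and whose `body` claim is the base64url SHA-256 of the raw request body; confirm this against the current Upstash documentation before relying on it. `make_test_token` is a hypothetical helper that builds a token in that assumed format.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Iterable, Optional


def _b64url_decode(segment: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _verify_with_key(token: str, body: bytes, key: str, now: float) -> bool:
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
        if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
            return False
        signed = f"{header_b64}.{claims_b64}".encode("ascii")
        expected = hmac.new(key.encode(), signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return False
        claims = json.loads(_b64url_decode(claims_b64))
        if claims.get("iss") != "Upstash" or float(claims["exp"]) < now:
            return False
        # Bind the token to this exact payload via the body-hash claim.
        claimed = str(claims["body"]).rstrip("=").encode()
        actual = _b64url_encode(hashlib.sha256(body).digest()).encode()
        return hmac.compare_digest(claimed, actual)
    except (ValueError, TypeError, KeyError, AttributeError):
        return False  # malformed tokens are rejected, never raised


def verify_qstash_signature(
    token: str, body: bytes, signing_keys: Iterable[str],
    now: Optional[float] = None,
) -> bool:
    """Accept the request if any configured signing key validates the token."""
    current = time.time() if now is None else now
    return any(_verify_with_key(token, body, k, current) for k in signing_keys)


def make_test_token(body: bytes, key: str, exp: float) -> str:
    """Build a token in the assumed format; for tests only."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body_hash = _b64url_encode(hashlib.sha256(body).digest())
    claims = _b64url_encode(
        json.dumps({"iss": "Upstash", "exp": exp, "body": body_hash}).encode()
    )
    sig = hmac.new(key.encode(), f"{header}.{claims}".encode("ascii"),
                   hashlib.sha256).digest()
    return f"{header}.{claims}.{_b64url_encode(sig)}"
```

Accepting a list of keys lets verification keep working through a signing-key rotation. In practice the official QStash SDK's verifier is likely the safer choice; a hand-rolled check should also validate the remaining claims.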

13. Matching weights hardcoded (SDD-06)

What: GraphEnhancedWeightedV1.WEIGHTS are class constants, not loaded from Flagsmith
Where: app/services/matching_engine.py
Impact: Can't A/B test different weight profiles without deploy
Effort: 1 day
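
A sketch of flag-driven weights with a safe fallback. It assumes the flag value arrives as a JSON string and deliberately makes no Flagsmith call; the weight names and values in `DEFAULT_WEIGHTS` are placeholders, since the real `GraphEnhancedWeightedV1.WEIGHTS` are not shown in this report.

```python
import json
import math
from typing import Dict, Optional

# Placeholder profile; substitute the existing class constants.
DEFAULT_WEIGHTS: Dict[str, float] = {"semantic": 0.4, "graph": 0.3, "clinical": 0.3}


def resolve_weights(
    flag_value: Optional[str],
    defaults: Dict[str, float] = DEFAULT_WEIGHTS,
) -> Dict[str, float]:
    """Return weights from a JSON flag value, or the defaults if it is invalid."""
    if not flag_value:
        return dict(defaults)
    try:
        candidate = json.loads(flag_value)
    except ValueError:
        return dict(defaults)
    # Require exactly the known keys so a typo cannot drop a factor silently.
    if not isinstance(candidate, dict) or set(candidate) != set(defaults):
        return dict(defaults)
    try:
        weights = {key: float(value) for key, value in candidate.items()}
    except (TypeError, ValueError):
        return dict(defaults)
    total = sum(weights.values())
    if any(w < 0 for w in weights.values()) or not math.isclose(
        total, 1.0, abs_tol=1e-6
    ):
        return dict(defaults)
    return weights
```

Falling back to the defaults on any malformed value means a bad flag edit degrades to current behavior instead of breaking matching.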

14. No unified demo seed script (Steer-08)

What: Requires 4+ separate scripts to set up demo
Where: app/seed_demo.py, app/seed_providers.py, app/seed_procedure_tests.py, etc.
Impact: Demo setup is manual and error-prone
Effort: 1 day

15. No BDD test layer (SDD-13.4)

What: No behave configuration, no .feature files
Impact: Missing behavior-driven tests for happy paths + GDPR flows
Effort: 2-3 days

16. No Event Type Registry (Steer-02.1, SDD-02)

What: Events are free-form strings, no schema validation, no EventType enum
Where: Events written ad-hoc across codebase
Impact: Fragile to event schema drift
Effort: 2-3 days
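
A dependency-free sketch of a registry. The four event names are the ones listed under gap 8; the required payload keys in `REQUIRED_FIELDS` are hypothetical. The spec calls for Pydantic models, which would replace the key-set check used here.

```python
from enum import Enum
from typing import Any, Dict, Set


class EventType(str, Enum):
    PATIENT_REGISTERED = "patient_registered"
    CONSENT_GRANTED = "consent_granted"
    CHAT_MESSAGE_SENT = "chat_message_sent"
    MATCH_COMPLETED = "match_completed"


# Hypothetical required payload keys per event type.
REQUIRED_FIELDS: Dict[EventType, Set[str]] = {
    EventType.PATIENT_REGISTERED: {"patient_id"},
    EventType.CONSENT_GRANTED: {"patient_id", "purpose"},
    EventType.CHAT_MESSAGE_SENT: {"patient_id", "message_id"},
    EventType.MATCH_COMPLETED: {"case_id"},
}


def validate_event(event_type: str, payload: Dict[str, Any]) -> EventType:
    """Reject unregistered event names and payloads missing required keys."""
    try:
        kind = EventType(event_type)
    except ValueError:
        raise ValueError(f"unregistered event type: {event_type!r}") from None
    missing = REQUIRED_FIELDS[kind] - set(payload)
    if missing:
        raise ValueError(f"{kind.value} payload missing: {sorted(missing)}")
    return kind
```

Calling `validate_event` at the single point where events are written would catch a misspelled event name at write time instead of in downstream analytics.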

17. No FHIR confidence field (SDD-03.4)

What: No confidence FLOAT on fhir_resources table
Impact: Can't implement spec'd merge rules (confidence delta > 0.3 flagging)
Effort: 0.5 days

18. Shadow mode not executing (SDD-06.14)

What: is_shadow field exists on MatchResult but no code runs dual strategies
Impact: Can't safely evaluate new matching strategies
Effort: 1-2 days
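
A sketch of dual-strategy execution. The `is_shadow` field is the one noted above; the other `MatchResult` fields, the `(name, callable)` strategy shape, and the `record` callback are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class MatchResult:
    """Stand-in for the real model; only is_shadow is confirmed by the audit."""
    strategy: str
    provider_ids: List[str]
    is_shadow: bool = False


# A strategy is a (name, callable) pair mapping a case ID to ranked providers.
Strategy = Tuple[str, Callable[[str], List[str]]]


def match_with_shadow(
    case_id: str,
    primary: Strategy,
    shadow: Optional[Strategy],
    record: Callable[[MatchResult], None],
) -> MatchResult:
    """Serve the primary result; run the shadow strategy only for comparison."""
    name, run = primary
    result = MatchResult(name, run(case_id))
    record(result)
    if shadow is not None:
        shadow_name, run_shadow = shadow
        try:
            record(MatchResult(shadow_name, run_shadow(case_id), is_shadow=True))
        except Exception:
            # A broken candidate must never affect the patient-facing result.
            pass
    return result
```

Because only the primary result is returned, the candidate strategy can be compared offline from the recorded rows before it is ever served.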

19. Clinical Context Agent has no QStash retry on failure (SDD-04.6)

What: Failed extractions logged but not queued for retry
Impact: Transient LLM failures permanently lose extraction
Effort: 1 day

20. Rate limiting only on public routes (SDD-12.11)

What: Authenticated endpoints unprotected from abuse
Where: app/middleware/rate_limiter.py — scoped to /api/v1/public/ only
Impact: No DoS protection on authenticated endpoints
Effort: 0.5 days

Priority 3 — Nice-to-Have

#	Gap	Source	Effort
21	`agent_name` missing from message model	SDD-08	1-2h
22	No `GET /patients/{id}/matches` endpoint	SDD-10	4h
23	No WCAG formal audit	SDD-11	1 day
24	No snapshot tests (API schema contracts)	SDD-13	4h
25	Intake progress not weighted per spec categories	Steer-01	2 days
26	No `useEventStream` reusable hook (frontend)	SDD-11	2h
27	Prettier not in frontend devDependencies	SDD-13	1h
28	Consent expiry warning logic deferred	SDD-09	4h
29	No E2E demo journey Playwright test	Steer-08	1-2 days
30	Unified patient events SSE endpoint	Steer-02	2-3 days

Architectural Divergences (Working Differently Than Spec)

These are not bugs — the system works — but the implementation shape differs from the spec's vision.

Spec Vision	Actual Implementation	Risk Level
4-stage matching pipeline (Qdrant→Neo4j→PG→LLM)	Flat single service call	Low for demo
Event-driven EHR Builder class with `process_event()`	Imperative function calls	Low for demo
Typed event registry with Pydantic models	Ad-hoc string event types	Medium — drift risk
Unified patient SSE endpoint	3 purpose-specific SSE endpoints	Low — works fine
Clerk JWT org-based RBAC	Header-based tenant ID	Medium — spoofable
LangGraph intake intent classification	Case orchestrator phase routing	Low — equivalent

What's Solid (No Gaps)

Technology stack (25 services, 21 free tiers)
Modular monolith with 8 isolated domains
4-agent LangGraph pipeline with deterministic fallbacks
Triple-tier OCR (PyMuPDF → Unstructured → Claude Vision)
FHIR R4 validation and storage
Neo4j knowledge graph (42 providers, OFFERS/REQUIRES_TEST)
Qdrant semantic search (42 provider embeddings)
43+ feature flags with YAML fallbacks
Guardrails (3 layers: classifier, output validator, file validator)
Correlation IDs + idempotency middleware
MCP server (6 tools)
Langfuse tracing on all agent calls
760 backend tests, 0 failures
SSE document upload progress (6 steps)
LLM response streaming with Redis SSE
Chat pipeline caching + parallel + deferred extraction
Inline OCR fast path
143-parameter document matching
Configurable gating thresholds

Recommended Fix Order

For investor demo (1-2 sessions):
1. Security headers (2h)
2. RTL rendering + Arabic templates (1 session)
3. Extraction indicators in chat (0.5 session)
4. Unified demo seed script (0.5 day)

For production readiness (3-5 sessions):
5. GDPR deletion cascade (Neo4j, Qdrant, R2, Redis)
6. Consent middleware enforcement
7. PostgreSQL RLS policies
8. mypy in CI
9. QStash webhook signature verification
10. Event emissions from all endpoints

For architectural alignment (post-seed):
11. EHR Builder service class with merge rules
12. 4-stage matching pipeline abstraction
13. Typed event registry
14. Neo4j patient node auto-sync
15. Shadow mode for matching strategies