Governance Framework Alignment Audit

Audit date: 2026-05-20
Source framework: Curaway.AI — Unified Governance, Operating Framework, Architecture and Guardrail Model (Dr. Shrikanth Naidu + Praveen Bharadwaj, dated 2026-05-19)
Platform commits audited: curaway_src@245e669, curaway-health-navigator@main
Auditor: Engineering (read-only review of code + ADRs + tests against framework requirements)

Legend

Symbol Meaning
✅ Aligned: explicit code/test enforcement exists
⚠️ Partial: concept implemented but with gaps
❌ Gap: framework requirement not implemented
📋 Organizational / policy artifact: outside code scope
🔒 Built-in safety control (DB constraint, middleware, etc.): strongest possible enforcement
✅🔒 Aligned AND backed by a built-in safety control

A. Foundational positioning

# Framework requirement Status Evidence / Gap
A.1 Curaway.AI positions as care coordination + patient navigation platform, NOT a medical provider ✅ CLAUDE.md ground rule #9 (Single Source of Truth for Brand Voice); config/voice_rules.yaml blocks medical-authority claims
A.2 "Does not practice medicine / diagnose / treat / guarantee outcomes" ✅ tests/test_no_medical_advice.py: imperative-verb scanner enforces this in patient-facing code
A.3 Operating doctrine ("Curaway coordinates. Providers treat. Doctors decide. Patients choose. AI supports. Humans validate. Audit trails defend.") 📋 Not codified anywhere. Belongs in docs/architecture/ or new docs/governance/operating-doctrine.md
A.4 Jurisdiction-specific regulatory review (India / GCC / EU / UK / USA) ❌ No docs/regulatory/<jurisdiction>.md files. The framework recommends periodic regulatory review throughout, but no formal artifact exists today

B. Responsibility & Liability Boundary

# Framework requirement Status Evidence / Gap
B.1 Curaway.AI owns: onboarding, intake, consent, coordination, provider discovery, records routing, scheduling, case mgmt, concierge, audit ✅ All owned domains have modules: patient_routes.py, consent.py, documents.py, matching_engine.py, teleconsultation_service.py, audit.py
B.2 Curaway.AI does NOT own: diagnosis, surgical decisions, prescriptions, anesthesia, hospital/surgeon negligence, complications, outcomes ✅ Voice rules + medical-advice scanner enforce this in language. clinical_extraction.py quarantines low-confidence findings rather than asserting them
B.3 Liability shield: transparent attribution, provider-wise accountability, documented consent, itemized fees, physician-reviewed workflows, AI explainability logs, human override, audit trails ⚠️ All present except explainability logs for AI decisions (no dedicated explainability store; ad-hoc trace data in Langfuse only) and human override interface (no UI surface yet)
B.4 "No liability without control. No clinical claim without provider accountability. No AI recommendation without traceability." ✅ Every llm_gateway.invoke() stamps prompt_version, case_id, patient_id. Provider attribution lives in matching_result.scored_providers

C. Platform Architecture & Modules

C.1 Intake Module

Requirement Status Evidence
Collect data only after consent ✅ Consent gate middleware app/middleware/consent_gate.py:24-41
State platform role to patient ⚠️ Disclaimer exists in voice_rules.yaml but no enforced opening-message rule in v6 prompt
Capture history, comorbidities, prior treatment, medications, allergies, mobility ✅ FHIR Patient + Condition + MedicationStatement + Procedure resources, plus medical_history JSONB
Ask for missing records ✅ app/agents/conversation_v6 + patient_intake_completeness (gated on Naidu's SOP content)
Avoid diagnosis / eligibility promises ✅ tests/test_no_medical_advice.py PATIENT_FACING_FILES list
Flag incomplete records ✅ medical_history.missing_information JSONB array (refreshed on every EHR rebuild; known bug #1026)
Generate structured case summaries ✅ _build_patient_context() + _render_findings_to_verify() (shipped via #1023)
Identify high-risk indicators + trigger escalation ⚠️ Risk assessor exists (ADR-0013, ~30 rules) but escalation is Phase 3 stub (app/services/escalation_service.py)
Timestamped AI summaries ✅ llm_usage table tracks every generation
Capture explicit consent for provider data sharing ✅ Purpose-scoped consent: medical_matching, clinical_data_processing
Must NOT: declare diagnosis / recommend surgery / hide risks / auto-enroll ✅ All enforced by voice rules + medical-advice scanner

C.2 Clinical Intelligence Layer (CIL)

Requirement Status Evidence
Position as decision support only, label outputs ⚠️ Position respected in code, but FE cards do not yet carry the "AI-assisted decision support output" label
Timestamped AI outputs + confidence levels ⚠️ Timestamps yes (Langfuse); confidence scoring only on extracted conditions (low_confidence quarantine via #1023), NOT on care-pathway suggestions
Strict algorithm + prompt version control ✅ Flagsmith prompt_version flag; v6 architecture (CLAUDE.md "Conversation prompt architecture")
Trigger escalation for high-risk patients ⚠️ Risk rules exist; escalation is Phase 3 stub
Refuse to generate decisions when data incomplete ✅ Conversation agent asks for missing records before proceeding
Auditability of every recommendation ✅ llm_usage + audit_logs tables
Separate AI observations from provider-approved decisions ⚠️ Conceptually separate but no UI affordance distinguishing "AI says X" from "Doctor approved X"
Must NOT: autonomous treatment plans / surgical decisions / hide risks / replace clinical judgment ✅ Voice + medical-advice scanners enforce

C.3 AI Learning Governance

Requirement Status Evidence
Anonymization for model improvement ❌ No model fine-tuning today; if/when adopted, no de-identification pipeline exists
Human review before production deployment ⚠️ New prompts gate on LLM grader fixture corpus (#1043), but no formal "Medical Governance Committee sign-off" workflow before flag flip
Version control for all model updates ✅ prompt_version Flagsmith flag, model registry YAML
Monitor model drift + bias ❌ No drift monitoring; no bias evaluation
Periodic AI validation reviews ❌ No quarterly review process documented
Rollback capability ✅ prompt_arch flag flips back to v4
Governance approval before retraining 📋 Not applicable yet; will be when fine-tuning happens
Explainability preserved after upgrades ⚠️ Only Langfuse traces; no dedicated explainability log

C.4 Provider Module

Requirement Status Evidence
Maintain provider contracts ⚠️ providers table has metadata fields but no contract document store / version tracker
Provider credentials + licenses ✅ Doctor.license_number, license_expiry; Neo4j Doctor nodes
Periodic license validation ❌ No automated job to check expiry; manual today
Track malpractice insurance ❌ Field doesn't exist on provider/doctor models
Quality + performance metrics ⚠️ Scoring inputs exist (provider_success_rate, provider_annual_cases) but no scorecard UI
Provider scorecard ⚠️ Data points captured; no formal scorecard schema
Refund policies per-provider ❌ No per-provider refund policy field
Emergency escalation protocols ❌ No structured per-provider escalation contact stored
Suspension protocols ❌ No automated suspension flow
Must NOT: onboard unverified, hide poor outcomes, continue expired licenses ⚠️ Manual onboarding today; no automated guard for expired licenses
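
The missing license-validation job could start from a check like the one below. This is a minimal sketch, not the platform's code: only the license_number and license_expiry field names come from the audit, while DoctorLicense, licenses_needing_attention, and the 30-day default window are hypothetical. Fetching rows and sending alerts are left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DoctorLicense:
    """Hypothetical projection of a doctor row (only the license fields are from the audit)."""
    doctor_id: str
    license_number: str
    license_expiry: date


def licenses_needing_attention(licenses, today, warn_days=30):
    """Return (expired, expiring_soon) lists relative to `today`.

    expired:       license_expiry is strictly before today
    expiring_soon: license_expiry falls within the next `warn_days` days (inclusive)
    """
    cutoff = today + timedelta(days=warn_days)
    expired = [lic for lic in licenses if lic.license_expiry < today]
    expiring_soon = [lic for lic in licenses if today <= lic.license_expiry <= cutoff]
    return expired, expiring_soon
```

A scheduled job could call this once a day, alert on the expiring list, and block new matches for doctors in the expired list, which would also give the "continue expired licenses" rule an automated guard.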

C.5 Concierge Module

Requirement Status Evidence
Itemize service components ✅ Commerce module: line-item invoices, separate platform vs medical fees
Avoid all-inclusive success language ✅ Voice rules block "all-inclusive", "guaranteed"
Preserve patient choice / multiple options ✅ Matching returns ranked list, patient selects
Route clinical questions to providers ✅ Conversation prompt v6 includes routing rule
Escalate emergencies ⚠️ No automated emergency-keyword detection in chat

C.6 Consent

Requirement Status Evidence
Digital informed consent before onboarding ✅ consent_gate middleware blocks pre-consent endpoints
Consent for AI-assisted coordination ⚠️ Generic data-processing consent exists, no dedicated ai_coordination purpose
Provider-specific consent ❌ Single global medical_matching purpose, no per-provider consent capture
Provider-specific surgical/high-risk consent ❌ Not implemented (would require pre-op form workflow)
Multilingual consent ⚠️ Voice rules support locale but consent forms are English-only
Versioning + immutability ✅ Consent records append-only (app/models/consent.py)
Withdrawal workflows ⚠️ API endpoint exists; FE has no consent-management UI
Audit trail of consent events ✅ Every grant/revoke writes to audit_logs + Event
Must NOT: assume consent / modify signed / share without consent ✅ Architecturally enforced by middleware
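
The combination of purpose-scoped consent and an append-only record can be illustrated as follows. This is a sketch under assumptions: ConsentEvent and has_active_consent are hypothetical names, and the real schema in app/models/consent.py may differ. Only the purpose strings (medical_matching, clinical_data_processing, ai_coordination) are taken from the audit.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ConsentEvent:
    """One immutable row in a hypothetical append-only consent log."""
    patient_id: str
    purpose: str   # e.g. "medical_matching"
    action: str    # "grant" or "revoke"
    at: datetime


def has_active_consent(events, patient_id, purpose):
    """True only if the latest event for this patient and purpose is a grant.

    Rows are never updated in place: a withdrawal is a new "revoke" event,
    so the full history stays available for audit.
    """
    relevant = [e for e in events if e.patient_id == patient_id and e.purpose == purpose]
    if not relevant:
        return False  # no record means no consent; never assume it
    latest = max(relevant, key=lambda e: e.at)
    return latest.action == "grant"
```

Under this shape, closing the per-provider gap would mean widening the lookup key, for example adding a provider identifier next to purpose, rather than changing how validity is decided.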

C.7 Payment & Financial Liability

Requirement Status Evidence
Separate platform fees from medical fees ✅ Commerce module shipped 2026-05-13 (commerce module feature spec); intent + invoice schemas distinguish source_domain
Provider-wise billing ✅ Commission table; per-provider commission schedule (#1028 backfill ran)
Share provider invoices with patients ⚠️ Invoice generation exists; patient-facing display TBD
Escrow / split payment ❌ Not implemented (Stripe Connect placeholder + Razorpay Route placeholder in app/config.py:158)
AML/KYC for international payments ❌ No KYC flow today
Dynamic estimate architecture ⚠️ Estimate generation exists; dynamic repricing on clinical change not wired
Must NOT: mix platform/medical fees, hide refund, bundle clinical w/o accountability ✅ Commerce schema enforces separation; refund policy field exists per intent

C.8 Rehab / Post-Op / Transition Care

Requirement Status Evidence
Separate agreements for rehab providers ⚠️ provider_type enum includes rehab; no contract repository
Compliance monitoring ❌ No wearable integration, no session-check workflow
Wearable device disclaimers 📋 Approved language in framework, not codified
Emergency escalation matrices ❌ Not implemented
LAMA documentation ❌ No LAMA flag on case model
Missed-session notifications ❌ Not implemented
Surgeon clearance before rehab ❌ Workflow not built
Must NOT: guarantee mobility outcomes ✅ Voice rules block

C.9 Communication & Marketing Governance

Requirement Status Evidence
Approved patient-facing language ✅ config/voice_rules.yaml; CI test
Block "miracle / magical / guaranteed cure / 100% success / risk-free / fixed outcome / AI is more accurate than doctors" ✅ All listed phrases in voice_rules.yaml blocklist
Approved positioning ("AI-enabled care coordination", "provider-led", etc.) ✅ Same source
Testimonials only with consent 📋 No testimonial module today
Periodic ad-content review 📋 Process question, not code
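
A phrase blocklist of this kind can be enforced with a small scanner that a CI test runs over patient-facing copy. The sketch below is illustrative only. It hard-codes a subset of the phrases listed above instead of loading config/voice_rules.yaml, and find_blocked_phrases is a hypothetical helper, not the platform's actual scanner.

```python
import re

# Subset of the framework's blocked marketing phrases, inlined for the example.
BLOCKED_PHRASES = [
    "miracle",
    "magical",
    "guaranteed cure",
    "100% success",
    "risk-free",
    "fixed outcome",
]


def find_blocked_phrases(text):
    """Return the blocked phrases found in `text`, in blocklist order.

    Matching is case-insensitive and anchored on non-word boundaries so that
    a phrase only matches as a whole word or whole multi-word expression.
    """
    lowered = text.lower()
    hits = []
    for phrase in BLOCKED_PHRASES:
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        if re.search(pattern, lowered):
            hits.append(phrase)
    return hits
```

A CI test would then assert that find_blocked_phrases returns an empty list for every string in the patient-facing corpus.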

D. Human-in-the-Loop Operating Model

# Requirement Status Evidence / Gap
D.1 Defined HITL actors (Care Coordinator, Clinical Coordinator, Doctor, Medical Board, Legal/Compliance Officer, Finance, Provider Manager, CXO, Patient) ⚠️ Role enum exists (RoleCode) but Medical Board / Compliance Officer / Provider Manager roles missing
D.2 Human Override Principle on every AI recommendation ⚠️ Conceptual only: a doctor can reject a recommendation, but an override interface for AI outputs (e.g. "AI says X, override to Y") is not a built feature
D.3 Escalation triggers (incomplete data, high-risk symptoms, comorbidities, AI confidence below threshold, missed rehab, emergency indicators, patient asks for diagnosis/guarantee) ❌ Risk rules exist but escalation routing is Phase 3 stub. High-priority gap.

E. AI Agent Behaviour Framework

# Requirement Status Evidence
E.1 Universal AI Agent Rules (18 rules) ✅ Encoded in v6 prompt architecture + voice rules + medical-advice scanner
E.2 Response pattern (acknowledge → clarify role → coordinate → identify if review needed → ask for missing info → next step → escalate if risk → document) ⚠️ Pattern is described in v6 prompt, but #1025 flagged multi-question violations in 4/5 turns, so current behavior diverges from one-question-per-turn
E.3 Refusal pattern when asked for diagnosis/treatment decision/guarantee ✅ Built into v6 conversation prompt; tested via fixture corpus

F. Dynamic Estimate & Repricing

# Requirement Status
F.1 Estimate structure (core + provider fees + platform fee + concierge + rehab + travel + variable risk + repricing clause + refund + exclusions) ⚠️ Estimate exists in commerce module; variable risk advisory + dynamic repricing clauses not wired into estimate output
F.2 Variable risk advisory disclosure ❌ Not in current estimate template
F.3 Dynamic repricing right reserved on clinical change ❌ Not wired

G. Provider Accountability

# Requirement Status
G.1 Provider Independence Principle (independent licensee, own malpractice coverage) ⚠️ Architectural; no enforcement field on provider model
G.2 Provider agreement clauses (jurisdiction, arbitration, liability cap, malpractice insurance, indemnity, suspension, etc.) 📋 Legal artifact; no contract repository in platform
G.3 Provider scorecard (license, malpractice, outcomes, feedback, complications, response time, etc.) ⚠️ Data points exist; UI scorecard TBD
G.4 Conflict of Interest / Provider Neutrality (transparent ranking, disclose sponsored, no commercial-only prioritization) ⚠️ Scoring is transparent in code (147-param registry, PR-A #767); disclosure to patient not implemented

H. Telemedicine / Remote Consultation

# Requirement Status Evidence
H.1 Curaway is NOT a telemedicine provider unless licensed ✅ Per framework + ADR-0018 Phase 2a
H.2 Provider attribution in tele-sessions ✅ teleconsultation_sessions.provider_id
H.3 Patient consent for tele-session ⚠️ Generic consent; no telemedicine-specific consent
H.4 Audit logs for tele-session ✅ Session lifecycle audited
H.5 Recording disabled per ADR-0025 ✅🔒 DB-level CHECK (recording_url IS NULL) constraint on teleconsultation_sessions — strongest possible control
H.6 Jurisdiction compliance 📋 Per-jurisdiction tele-consult law not codified
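
The H.5 control is the kind of rule a database can enforce regardless of application code. The sketch below reproduces the idea with an in-memory SQLite table. The table and column names follow the audit, but the platform runs on Postgres, so the DDL here is a stand-in, and try_insert is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE teleconsultation_sessions (
        id            INTEGER PRIMARY KEY,
        provider_id   TEXT NOT NULL,
        -- Recording is disabled: any non-NULL value violates the constraint.
        recording_url TEXT CHECK (recording_url IS NULL)
    )
    """
)


def try_insert(provider_id, recording_url=None):
    """Insert a session row; return False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO teleconsultation_sessions (provider_id, recording_url) "
                "VALUES (?, ?)",
                (provider_id, recording_url),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling try_insert with only a provider id succeeds, while passing any recording URL is rejected by the database itself. A buggy or bypassed service layer therefore cannot persist a recording link.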

I. Patient Autonomy

# Requirement Status
I.1 Right to understand platform role ⚠️ Voice rules support, no enforced first-turn disclosure
I.2 Right to know providers are independent ⚠️ Not surfaced in UI today
I.3 Right to choose among multiple providers ✅ Matching returns ranked list; FE supports selection
I.4 Right to decline bundled services ⚠️ Commerce module supports per-item, FE flow unclear
I.5 Right to withdraw from pathway ✅ Consent revoke + case archive
I.6 Access to provider-wise cost ⚠️ Backend has it; FE display TBD
I.7 Receive doctor comments as-is ❌ Doctor comments path not yet built

J. Emergency & Crisis Boundary

# Requirement Status
J.1 Curaway is not emergency responder / ICU / ambulance ✅ Documented
J.2 Auto-detect emergency indicators in chat → route ❌ No emergency-keyword classifier today
J.3 Maintain geo-specific emergency guidance ❌ Not implemented
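
Until a proper classifier exists for J.2, a regex pre-filter can act as a first line of defense. The sketch below is illustrative only: the patterns are placeholder examples, not a clinically validated list, and emergency_indicators is a hypothetical function name. Any real keyword set would need clinical review and per-language variants.

```python
import re

# Placeholder patterns for illustration; NOT a clinically validated list.
EMERGENCY_PATTERNS = [
    r"\bchest pain\b",
    r"\b(can't|cannot|can not) breathe\b",
    r"\bunconscious\b",
    r"\bsevere bleeding\b",
    r"\bsuicid(e|al)\b",
]
_EMERGENCY_RE = re.compile("|".join(EMERGENCY_PATTERNS), re.IGNORECASE)


def emergency_indicators(message):
    """Return the matched emergency phrases (lower-cased) in order of appearance."""
    return [m.group(0).lower() for m in _EMERGENCY_RE.finditer(message)]
```

A non-empty result would short-circuit the normal conversation turn: alert a coordinator and show the patient local emergency guidance instead of continuing intake. A regex filter will miss paraphrases and typos, so it complements rather than replaces a classifier.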

K. Data Privacy & Cybersecurity

# Requirement Status Evidence
K.1 End-to-end encryption ✅ TLS in transit; Fernet field-level encryption (app/services/encryption.py) for PII fields
K.2 RBAC enforced ✅ app/middleware/rbac_middleware.py, role enum, permission codes
K.3 Event-wise data access logs ✅ audit_logs table, append-only
K.4 Breach notification workflow ⚠️ No documented breach-notification runbook; Telegram alerts can act as relay
K.5 Data anonymization for analytics ✅ PostHog uses patient_id UUID only (no PII), following the analytics.ts pattern
K.6 Periodic penetration testing 📋 Process; not codified
K.7 Cybersecurity insurance 📋 Org artifact
K.8 Data residency ⚠️ Railway PG single-region (likely US); no per-jurisdiction data placement
K.9 Minimum necessary access principle ✅ RBAC + per-resource gates (require_case_access, require_patient_access)

L. Business Continuity & DR

# Requirement Status
L.1 Disaster recovery plan ❌ No documented DR plan
L.2 Backup infrastructure ⚠️ Railway PG auto-backups (no documented retention/restore drill)
L.3 Cybersecurity incident workflow ❌ Not documented
L.4 Failover capability ⚠️ Neo4j Aura projection from PG provides eventual consistency; no PG failover region
L.5 Emergency communication plan ❌ Not documented
L.6 Periodic recovery testing ❌ No drill cadence

M. Third-Party Vendor Governance

# Requirement Status
M.1 Vendor inventory ⚠️ Documented in docs/reference/technology-stack.md, not formal inventory
M.2 DPA / BAA tracking ❌ No artifact
M.3 Vendor due diligence 📋 Process not codified
M.4 Breach notification clauses 📋 In individual contracts, not centralized
M.5 Periodic vendor security review ❌ Not scheduled

N. Data Lifecycle Governance

# Requirement Status
N.1 Jurisdiction-specific retention schedules ❌ No retention policy table per data type
N.2 Secure deletion workflows ✅ GDPR Article 17 cascade-delete (tests/test_gdpr_deletion.py, 16 stores)
N.3 Legal hold procedures ❌ No legal-hold mechanism
N.4 Audit trail retention ⚠️ Audit logs immutable; no documented retention horizon (default = forever)
N.5 Prevention of indefinite unnecessary medical data retention ❌ No TTL on patient data; relies on user-initiated erasure

O. Audit Trail

# Requirement Status Evidence
O.1 27 specific events logged (onboarding, consent, AI output, AI confidence, prompt version, human review, doctor review, provider selection, estimate generation, risk disclosure, payment, escalation, etc.) ⚠️ Most are logged via audit_logs + Event + llm_usage tables. Missing: explicit logs for "medical board review", "risk disclosure event", "doctor review event" as discrete event types
O.2 Audit trail answers: Who? Was it AI/human/provider? When? What data? Confidence? Reviewed by human? Doctor approved? Patient informed? Consent valid? Risk disclosed? ⚠️ Most queryable; "Was it reviewed by a human?" and "Doctor approved?" are weakest links because HITL workflow is stub

P. Adverse Event & Sentinel Governance

# Requirement Status
P.1 Adverse event reporting workflow ❌ Not built
P.2 Sentinel event classification ❌ No taxonomy
P.3 RCA workflow ❌ Not built
P.4 Provider review trigger on adverse event ❌ Not wired
P.5 Mortality / ICU escalation / data breach event types ❌ Not classified

This is one of the largest gaps between the platform and the framework.

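
A first step toward P.2 and P.5 is a shared taxonomy that code, alerts, and audit events can all reference. The sketch below uses the five categories named in the quick-wins list. AdverseEventType, parse_event_type, and the string values are hypothetical. Which categories count as sentinel events is deliberately left out, since that classification belongs to clinical governance rather than engineering.

```python
from enum import Enum


class AdverseEventType(str, Enum):
    """Hypothetical adverse-event taxonomy; category names follow the audit's quick-wins list."""
    MORTALITY = "mortality"
    ICU_ESCALATION = "icu_escalation"
    SURGICAL_COMPLICATION = "surgical_complication"
    IMPLANT_FAILURE = "implant_failure"
    DATA_BREACH = "data_breach"


def parse_event_type(raw):
    """Normalise free-form input to a taxonomy member.

    Raises ValueError for unknown categories so that unclassified events
    are surfaced instead of being silently dropped.
    """
    return AdverseEventType(raw.strip().lower())
```

Because the enum subclasses str, members serialise cleanly into JSON payloads and audit rows, and an alert category can reuse the same value.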
Q. Decision Rights Matrix (Curaway / AI / Doctor / Patient)

The doc specifies who decides what across 13 decision areas. Spot-checked alignment:

Decision Framework expectation Platform reality
Final diagnosis Doctor only ✅ Voice rules block AI/platform; doctor surface exists
Treatment plan Doctor only ✅ Same
Provider choice Platform facilitates, AI supports, patient final ✅ Matching engine outputs ranked list, patient selects
Cost estimate Platform coordinates, AI assists, provider inputs, patient accepts ⚠️ Platform + AI handled; provider input loop weak
Medical fee Provider owns ✅ Commerce module separates
Platform fee Curaway owns
Outcome guarantee "No absolute guarantee" — all parties acknowledge ✅ Voice rules + medical-advice scanner enforce

R. Guardrail Library

Category Framework count Platform enforcement
Clinical guardrails (10 "must never") All 10 ✅ Voice + medical-advice scanners
Legal guardrails (10 "must always") 10 ⚠️ Voice/marketing covered; jurisdiction/arbitration/liability clauses are contract-level (📋 org)
Financial guardrails (8 "must never") 8 ✅ Commerce module enforces fee separation, refund policy, no fixed-outcome pricing
Marketing guardrails (10 "must never") 10 ✅ Voice rules block all listed phrases
Data guardrails (8 "must never") 8 ✅ Consent middleware + RBAC + encryption enforce

S. Approved Language & Standard Disclaimers

Disclaimer Status
Curaway standard disclaimer ("AI-enabled care coordination... outcomes vary") ⚠️ Wording approved in framework; not surfaced in FE footer / first turn
AI output label ("AI-assisted decision support... requires review") ❌ Not displayed on FE cards
Estimate disclaimer ❌ Not on estimate display
Rehab disclaimer 📋 Rehab module not built

T. Organizational / Process artifacts

Item Status
Recommended Governance Committees (6: Medical, AI, Legal & Compliance, Provider, Patient Safety & Escalation, Regulatory & Jurisdiction) ❌ None formally chartered
28 Key Policy Documents (Platform Positioning, AI Clinical Decision Support, HITL Review, Provider Onboarding, Patient Consent & Disclosure, Cross-Border, Payment & Refund, Dynamic Estimate, Rehab, Emergency Escalation, Data Privacy, Marketing, Testimonial, AI Agent Behaviour, Audit Trail, Patient Autonomy, Provider Contracting, Wearable Data, Incident & Breach, Telemedicine, Adverse Event, AI Learning, Data Retention, Vendor, Regulatory Classification, BCDR, Conflict of Interest) ❌ 0 of 28 exist as formal policy docs

Overall scorecard

Dimension Aligned (✅) Partial (⚠️) Gap (❌) Org/Policy (📋)
Tech: Auth, audit, consent, encryption, RBAC 12 2 1 0
AI guardrails (voice, medical advice, language) 10 3 0 0
HITL & escalation 1 3 7 0
Provider governance 2 5 6 2
Financial / Commerce 4 3 3 0
Rehab / Post-op 1 1 7 0
BCDR / Vendor / Data lifecycle 1 4 9 3
Adverse events / Sentinel 0 0 5 0
Policy artifacts (28 listed) 0 0 28

Top 10 priority gaps (ranked by risk)

  1. ❌ Adverse Event & Sentinel Governance (Section P) — no module exists; mortality/ICU/data-breach events have no classification, RCA workflow, or provider-review trigger. Largest single defensibility gap.
  2. ❌ HITL escalation workflows (D.3) — risk rules detect, escalation routing is Phase 3 stub. Without this, "Human-in-the-Loop" is documentation, not enforcement.
  3. ❌ Rehab module — surgeon clearance, missed-session triggers, LAMA documentation, wearable disclaimers (C.8) — entirely unbuilt; ADR-0018 transport work touches edges.
  4. ❌ Provider governance — license expiry monitoring, malpractice insurance field, suspension workflow (C.4) — manual today; framework expects automation.
  5. ❌ Emergency keyword detection + routing in patient chat (J.2) — no classifier; emergency indicators today rely on coordinator manual flag.
  6. ❌ Per-provider / surgical-specific consent capture (C.6) — single generic consent; framework requires hospital-specific high-risk surgical consent.
  7. ⚠️ AI Output Label + Estimate Disclaimer + Rehab Disclaimer on FE (S) — approved language exists in doc; not surfaced to patient.
  8. ❌ Retention schedules per data type / legal hold (N.1, N.3) — implicit "keep forever" model; framework requires jurisdiction-specific retention with secure-destruction verification.
  9. ❌ Bulk patient data export (GDPR Article 20) (I.7, N.2) — erasure works; export does not.
  10. ❌ Documented BCDR plan with RPO/RTO + recovery testing (L) — Railway-managed backups exist; no documented restore drill or SLA.

Top 10 priority gaps — ranked by ease of fix (quick wins)

  1. ⚠→✅ Surface AI Output Label on FE cards ("AI-assisted decision support output. Not a final diagnosis.") — small FE change, large defensibility win.
  2. ⚠→✅ First-turn platform role disclosure in conversation v6 (already partially in voice rules; needs explicit prompt rule).
  3. ⚠→✅ Estimate disclaimer text in commerce module invoice template.
  4. ❌→⚠ License-expiry automated alert — cron job querying doctors.license_expiry, Telegram alert at T-30 days.
  5. ❌→⚠ Emergency-keyword regex pre-filter in conversation pipeline → route to coordinator alert (defense-in-depth before formal classifier).
  6. ❌→⚠ Adverse-event taxonomy enum in code (Mortality / ICU_escalation / Surgical_complication / Implant_failure / Data_breach) + Telegram alert category.
  7. ❌→⚠ Document operating doctrine (Section A.3) in docs/architecture/operating-doctrine.md.
  8. ❌→📋 Charter the 6 governance committees — even informal meeting cadence + minutes folder beats zero.
  9. ❌→⚠ Vendor inventory table in docs/reference/vendors.md with DPA-status column.
  10. ❌→⚠ LAMA flag on Case model + audit event.

Net assessment

The platform is strongly aligned on the tech-stack layer (consent, audit, RBAC, encryption, voice rules, medical-advice enforcement, GDPR erasure, telemedicine recording-disabled, FHIR, feature flags, LLM observability, matching transparency). These are the layers most directly enforceable in code, and Curaway has invested heavily here.

The platform is weakly aligned on three governance dimensions:

  1. Clinical operations layer — adverse events, HITL escalation, rehab, surgeon clearance, emergency routing. These are clinical workflows that need both code and a Medical Board to operate.
  2. Provider operations layer — license monitoring, malpractice tracking, scorecards, suspension workflow. Mostly manual today.
  3. Organizational governance layer — 6 committees not chartered, 28 policy documents not drafted. These don't need code, but they're the paper trail that makes the framework defensible in a regulatory or medico-legal review.

The biggest single risk is Adverse Event Governance — if a sentinel event happens today, there is no module to classify it, no RCA workflow to document it, and no automated provider-review trigger. Combined with the HITL escalation gap, this is the clearest concentration of medico-legal exposure relative to framework requirements.

  • Week 1-2 (small, high leverage): Quick-wins 1-3 (AI label, role disclosure, estimate disclaimer) + adverse-event taxonomy enum (#6) + LAMA flag (#10).
  • Sprint (1 month): Build HITL escalation backbone (D.3) + per-provider/surgical consent (C.6) + license expiry job (C.4) + emergency keyword pre-filter (J.2).
  • Quarter: Adverse Event module end-to-end (P) + rehab module foundation (C.8) + BCDR plan + 6 committees chartered + first 10 policy docs drafted.

Audit source code references valid as of commit 245e669. Re-audit recommended after each material module ship (escalation backbone, adverse-event module, rehab module).