ADR-0027 - Procedure Seeder: YAML as Single Source of Truth
Date: 2026-05-18
Status: Accepted
Deciders: Engineering (AI-assisted), Dr. Shrikanth Naidu (clinical content review)
Related issues: #960, #279, #985 (Neo4j projection), PR #990, PR #994
Context
config/procedures.yaml was declared the single source of truth in its own header comment,
but app/seeds/seed_procedures.py::PROCEDURES was separately hand-curated with rich clinical
payloads (required documents, comorbidity screening, contraindications, cost ranges, recovery
timelines, travel considerations). This created drift: 32 yaml entries vs 20 Python dict rows,
and 6 yaml-only procedures with zero procedure_requirements row in the database.
The backfill script (scripts/backfill_procedure_documents.py) imported PROCEDURES directly,
so it also missed the 6 yaml-only codes.
Decision
Option A — yaml is the single authoring surface; seed_procedures.py::PROCEDURES is a derived
loader.
The yaml schema is extended with the following new optional fields per entry:
parent_procedure_code, category, snomed_primary, required_documents, required_tests,
comorbidity_screening, contraindications, cost_range, recovery_timeline,
travel_considerations.
seed_procedures.py adds _load_procedures_yaml() and rebuilds:
The base templates (ORTHO_BASE, CARDIAC_BASE, ONCO_BASE) stay hand-coded because they
are not catalog procedures — they are inheritance parents used only by the seeder.
Inheritance algorithm
Shallow merge with list-concatenation for INHERITABLE_LIST_FIELDS:
INHERITABLE_LIST_FIELDS = {"required_documents", "required_tests", "comorbidity_screening", "contraindications"}
- For scalar fields (category, cost_range, etc.), the child value wins; parent fills if absent.
- For list fields in INHERITABLE_LIST_FIELDS, the parent list comes first and the child list is appended (parent provides defaults; child adds procedure-specific items).
- A child entry with an empty list (required_documents: []) explicitly overrides the parent to empty; an empty list is NOT treated as "absent". Use null or omit the key to inherit.
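The merge rules above can be sketched as a small resolver. This is a minimal illustration rather than the code in app/seeds/_inheritance.py: the function name resolve_entry is hypothetical, and treating null as "absent" for scalar fields as well as list fields is an assumption.

```python
from copy import deepcopy
from typing import Any

INHERITABLE_LIST_FIELDS = {
    "required_documents",
    "required_tests",
    "comorbidity_screening",
    "contraindications",
}


def resolve_entry(parent: dict[str, Any], child: dict[str, Any]) -> dict[str, Any]:
    """Shallow-merge child over parent, concatenating inheritable lists."""
    merged = deepcopy(parent)
    for key, value in child.items():
        if value is None:
            # null in yaml counts as absent, so the parent value is kept
            # (assumed to apply to scalar fields too).
            continue
        if key in INHERITABLE_LIST_FIELDS and value:
            # Non-empty child list: parent items first, child items appended.
            merged[key] = list(parent.get(key) or []) + list(value)
        else:
            # Scalars: child wins. An explicit empty list also lands here
            # and overrides the parent to empty.
            merged[key] = deepcopy(value)
    return merged


# Placeholder values only; these are not real catalog entries.
base = {"category": "base", "required_documents": ["parent_doc"]}
print(resolve_entry(base, {"required_documents": ["child_doc"]}))
print(resolve_entry(base, {"required_documents": []}))
```

Keeping the empty-list case distinct from null is what lets a child opt out of inherited defaults without a separate flag.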
Marker grammar
Examples:
- fabricated_pending_ops_2026_05_17 (existing Phase-2 entries)
- fabricated_pending_ops_2026_05_19 (new strawman entries from this PR)
- naidu_approved_clinical_sweep_2026_06_01 (post-#169 sign-off)
The assert_marker_valid() helper in app/seeds/_base.py enforces this pattern.
CI test tests/seeds/test_marker_grammar.py walks every yaml and every metadata->>data_source
value in a freshly-seeded test DB.
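A check of this kind might look like the sketch below. The regular expression is an assumption inferred only from the three example markers (lowercase snake_case words followed by a _YYYY_MM_DD date); it is not the authoritative grammar, and the function body is a stand-in for the real helper in app/seeds/_base.py.

```python
import re
from datetime import date

# Hypothetical grammar inferred from the examples: snake_case label, then _YYYY_MM_DD.
MARKER_RE = re.compile(
    r"(?P<label>[a-z][a-z0-9]*(?:_[a-z][a-z0-9]*)*)"
    r"_(?P<year>\d{4})_(?P<month>\d{2})_(?P<day>\d{2})"
)


def assert_marker_valid(marker: str) -> None:
    """Raise ValueError unless the marker matches the assumed grammar."""
    match = MARKER_RE.fullmatch(marker)
    if match is None:
        raise ValueError(f"invalid data_source marker: {marker!r}")
    # date() raises ValueError for impossible dates such as 2026_02_30.
    date(int(match["year"]), int(match["month"]), int(match["day"]))


assert_marker_valid("fabricated_pending_ops_2026_05_17")  # passes silently
```

Validating the trailing date as a real calendar date, not just as digits, catches typos that a pure pattern match would let through.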
Generalised seeder architecture (Section 9)
This ADR commits to the pattern established for all future entities:
config/<entity_plural>/seed.yaml ← single authoring surface
app/seeds/_validators.py ← Pydantic validation (fail-loud at import)
app/seeds/_inheritance.py ← shallow-merge + list-concat resolver
app/seeds/_base.py ← SeederBase: upsert, dry-run, diff, marker validation
app/seeds/<entity>_seeder.py ← idempotent Postgres upsert
app/seeds/_runner.py ← DAG-walking master runner
The migration roadmap for all other entities is in Section 10 of
docs/superpowers/plans/2026-05-18-required-documents-gap-fill.md.
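As an illustration of the DAG-walking runner in this layout, the sketch below orders seeders with the standard-library graphlib module. The run_seeders function, the seeder names, and the dependency between them are all hypothetical; the ADR does not specify how app/seeds/_runner.py is implemented.

```python
from graphlib import TopologicalSorter
from typing import Callable


def run_seeders(
    seeders: dict[str, Callable[[], None]],
    depends_on: dict[str, set[str]],
) -> list[str]:
    """Run each seeder after all of its dependencies; return the order used."""
    graph = {name: depends_on.get(name, set()) for name in seeders}
    unknown = {dep for deps in graph.values() for dep in deps} - set(seeders)
    if unknown:
        raise ValueError(f"dependencies without a seeder: {sorted(unknown)}")
    # static_order() raises graphlib.CycleError (a ValueError) on cycles,
    # before any seeder has run.
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        seeders[name]()
    return order


# Hypothetical entities: requirements rows are seeded after procedures.
order = run_seeders(
    {
        "procedure_requirements": lambda: None,
        "procedures": lambda: None,
    },
    {"procedure_requirements": {"procedures"}},
)
print(order)  # ['procedures', 'procedure_requirements']
```

Computing the full order before running anything means a cyclic or dangling dependency fails loudly without leaving a partially seeded database.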
Trade-offs
| Pro | Con |
|---|---|
| yaml is already the ops/clinical review surface | yaml carries ~3000 lines of clinical detail (previously readable Python dicts) |
| Naidu reviews one file, not two | YAML strings lack IDE type-checking on field names |
| Loader validates via Pydantic at import — bad authoring fails loud | Import-time validation adds ~50ms to seeder startup |
| New fields added to yaml without code change | Base templates remain Python — one more place to update |
| PROCEDURES symbol unchanged → existing tests pass | |
Consequences
- config/procedures.yaml grows to ~600 lines with clinical payload. This is expected and acceptable; yaml is designed for this.
- app/seeds/seed_procedures.py shrinks by ~2000 lines as Python dicts are removed.
- All 6 yaml-only procedures (67036, 67228, 92920, ONCO-CHEMO, ONCO-RAD, ONCO-SURG) now have required_documents authored and seeded.
- OPHTHALMOLOGY_BASE is added as a new base template in the Python list.
- scripts/backfill_procedure_documents.py is superseded by scripts/backfill_procedure_clinical_payload.py, which sources from the yaml loader.
- The old app/seed_*.py modules at the app root become deprecation shims.