06 — Matching Engine Design

Pluggable Strategy Pattern

| Strategy | Version | Status | Description |
|---|---|---|---|
| Weighted Rules | v1 | MVP default | clinical_fit(0.4) + outcomes(0.2) + cost(0.15) + travel(0.15) + preferences(0.1) |
| Agent-Enhanced | v1.5 | MVP (flagged) | Weighted Rules + LLM pre-analysis + edge case re-ranking |
| ML Ranking | v2 | Stub | Learning-to-rank trained on match acceptance signals |
| Hybrid | v3 | Stub | Rules + ML ensemble with configurable blend |

Active strategy selected via Flagsmith feature flag matching_strategy.

Four-Stage Pipeline

Stage 0: Semantic Discovery (Qdrant) — ALWAYS ON

Voyage AI embeddings encode patient clinical profile. Qdrant ANN search finds semantically similar providers/procedures, discovering candidates that exact code matches might miss.

  • Input: Patient clinical embedding (conditions + procedures + preferences as text)
  • Output: Candidate provider IDs with semantic similarity scores
  • Never skipped. Qdrant is Stage 0, not optional enrichment.
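
The discovery step can be illustrated without a running Qdrant instance. The sketch below uses brute-force cosine similarity as a stand-in for Qdrant's ANN search. The function name `discover_candidates`, the toy 3-dimensional vectors, and the provider IDs are hypothetical; in the design above, the vectors would be Voyage AI embeddings and the search would run inside Qdrant.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def discover_candidates(patient_vec: list, provider_vecs: dict, top_k: int = 3) -> list:
    """Return the top_k (provider_id, similarity) pairs, most similar first.

    Brute-force scan: an ANN index approximates this ranking at scale.
    """
    scored = [(pid, cosine(patient_vec, vec)) for pid, vec in provider_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical data).
provider_vecs = {
    "prov_a": [0.9, 0.1, 0.0],
    "prov_b": [0.0, 1.0, 0.0],
    "prov_c": [0.7, 0.7, 0.0],
}
candidates = discover_candidates([1.0, 0.0, 0.0], provider_vecs, top_k=2)
print(candidates)
```

The output has the shape described above: candidate provider IDs paired with similarity scores, ready to be narrowed by the hard-constraint stage.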

Stage 1: Hard Constraint Filtering (Neo4j)

Graph traversal eliminates providers that cannot serve this patient. Pass/fail, no partial scoring.

MATCH (p:Patient {patient_id: $patient_id})-[:DIAGNOSED_WITH]->(c:Condition)
      -[:INDICATED_FOR]->(proc:Procedure)
      <-[:PERFORMS]-(prov:Provider)-[:ACCREDITED_BY]->(a:Accreditation)
WHERE prov.status = 'active'
  AND proc.cpt_code IN $required_procedures
  AND EXISTS((prov)-[:LOCATED_IN]->(:Location)<-[:HAS_VISA_CORRIDOR]-(:Country {code: $patient_nationality}))
RETURN prov, proc, a

Hard constraints:

  • Provider has indicated procedure capability (CPT match)
  • Provider has relevant specialty accreditation
  • Medical visa corridor exists (nationality → destination)
  • Provider not in blocked status
  • Patient consent covers destination jurisdiction
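
The pass/fail semantics of the constraints above can be sketched in plain Python, independent of Neo4j. The `ProviderFacts` record and `passes_hard_constraints` function are hypothetical names for illustration; the procedure check uses "any required code matches", mirroring the `IN $required_procedures` clause in the query.

```python
from dataclasses import dataclass


@dataclass
class ProviderFacts:
    # Hypothetical flattened view of what the graph traversal would establish.
    provider_id: str
    status: str
    cpt_codes: set
    accredited: bool
    corridor_nationalities: set  # nationalities with a visa corridor to this provider
    jurisdiction: str


def passes_hard_constraints(prov: ProviderFacts, required_procedures: set,
                            nationality: str, consented_jurisdictions: set) -> bool:
    """Every check must pass; there is no partial credit."""
    checks = (
        prov.status == "active",
        bool(prov.cpt_codes & required_procedures),
        prov.accredited,
        nationality in prov.corridor_nationalities,
        prov.jurisdiction in consented_jurisdictions,
    )
    return all(checks)


# Sample data only; 27447 is the CPT code for total knee arthroplasty.
providers = [
    ProviderFacts("prov_a", "active", {"27447"}, True, {"AE", "SA"}, "IN"),
    ProviderFacts("prov_b", "active", {"27447"}, True, {"SA"}, "IN"),   # no corridor for AE
    ProviderFacts("prov_c", "blocked", {"27447"}, True, {"AE"}, "TH"),  # not active
]
eligible = [
    p.provider_id
    for p in providers
    if passes_hard_constraints(p, {"27447"}, "AE", {"IN", "TH"})
]
print(eligible)
```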

Stage 2: Weighted Scoring (PostgreSQL)

Deterministic scoring across 5 active domains. Weights configurable via Flagsmith.

| Domain | Weight | Sub-Parameters | Source |
|---|---|---|---|
| Clinical Fit | 0.40 | Primary diagnosis match, procedure capability, comorbidity management, pre-op test availability, surgeon volume | FHIR resources, Neo4j |
| Outcomes | 0.20 | Success rate, complication rate, satisfaction, mortality index | Provider data, Neo4j |
| Cost | 0.15 | Procedure cost, accommodation, travel, package transparency | Provider pricing, Frankfurter FX |
| Travel & Logistics | 0.15 | Visa probability, flight connectivity, timezone diff, companion accommodation | Regulatory data, Neo4j |
| Patient Preferences | 0.10 | Language concordance, dietary compliance, gender preference, religious accommodation | Patient profile, provider capabilities |
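
As a worked example of the weighting, the sketch below combines five per-domain scores into a total. It assumes each domain score is already normalized to 0.0–1.0; the `total_score` function and the sample scores are hypothetical, and how each domain score is derived from its sub-parameters is not shown.

```python
# Weights taken from the table above; keys follow the strategy description.
WEIGHTS = {
    "clinical_fit": 0.40,
    "outcomes": 0.20,
    "cost": 0.15,
    "travel": 0.15,
    "preferences": 0.10,
}


def total_score(domain_scores: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of per-domain scores, each assumed to lie in [0.0, 1.0]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(domain_scores)
    if missing:
        # Fail loudly rather than silently scoring a missing domain as 0.
        raise KeyError(f"missing domain scores: {sorted(missing)}")
    return sum(weights[d] * domain_scores[d] for d in weights)


# Hypothetical provider: 0.36 + 0.16 + 0.075 + 0.09 + 0.10
example = total_score({
    "clinical_fit": 0.9,
    "outcomes": 0.8,
    "cost": 0.5,
    "travel": 0.6,
    "preferences": 1.0,
})
print(round(example, 3))
```

Validating that the weights sum to 1.0 is a cheap guard when weights are edited remotely through a flag service.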

Confidence scoring: Data completeness reduces confidence, not ranking. Sparse profiles → lower confidence score (uncertainty), not penalized position.

def calculate_confidence(provider_data: dict) -> float:
    """
    Confidence = data completeness ratio.
    Provider with 80% of expected data fields → confidence 0.80.
    Provider with 40% → confidence 0.40.
    Displayed alongside match score to indicate reliability.
    """
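
One possible body for this function is sketched below, assuming completeness is measured against a fixed list of expected fields. The `expected_fields` parameter and the field names in `EXPECTED_FIELDS` are hypothetical; the real list would come from the provider data model.

```python
# Hypothetical list of fields a complete provider record would carry.
EXPECTED_FIELDS = (
    "success_rate",
    "complication_rate",
    "satisfaction",
    "mortality_index",
    "procedure_cost",
)


def calculate_confidence(provider_data: dict, expected_fields=EXPECTED_FIELDS) -> float:
    """Share of expected fields that are present and not None."""
    if not expected_fields:
        raise ValueError("expected_fields must not be empty")
    present = sum(1 for name in expected_fields if provider_data.get(name) is not None)
    return present / len(expected_fields)


rich = {"success_rate": 0.97, "complication_rate": 0.02,
        "satisfaction": 4.6, "procedure_cost": 7200}
sparse = {"success_rate": 0.95, "procedure_cost": 6800, "satisfaction": None}

print(calculate_confidence(rich))    # 4 of 5 fields present
print(calculate_confidence(sparse))  # 2 of 5 fields present
```

Because the value is returned separately from the match score, a sparse profile keeps its rank and is only marked as less certain.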

Stage 3: LLM Re-Ranking & Explanation (Claude)

When agent_enhanced_matching flag enabled:

  1. Claude Sonnet reviews top 5 scored providers for:
       • Comorbidity interaction risks (diabetes + TKR anesthesia)
       • Contraindications missed by deterministic scoring
       • Edge cases where lower-scored provider is better for this patient
  2. Claude Haiku generates natural language explanations per provider in patient locale

When flag disabled: Stage 3 is skipped (Stage 0 stays on, as stated above). Ranking comes from pure deterministic WeightedScoringV1, so results match the deterministic baseline. Zero regression risk.
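
The flag gating can be sketched as a thin wrapper around the deterministic ranking. This is an illustration under assumptions, not the engine's code: `apply_stage3` and the `reranker` callable are hypothetical, with the callable standing in for the Claude review step. Only the top 5 are handed to the reranker, and the deterministic order is kept if the reranker returns a different set of providers.

```python
from typing import Callable


def apply_stage3(scored: list, enabled: bool,
                 reranker: Callable[[list], list], top_n: int = 5) -> list:
    """Return providers in final order; deterministic when the flag is off."""
    ranked = sorted(scored, key=lambda p: p["total_score"], reverse=True)
    if not enabled:
        return ranked
    head, tail = ranked[:top_n], ranked[top_n:]
    reviewed = reranker(head)
    # Guard: the reranker may reorder, but must not add or drop providers.
    if sorted(p["provider_id"] for p in reviewed) != sorted(p["provider_id"] for p in head):
        return ranked
    return reviewed + tail


def swap_top_two(head: list) -> list:
    """Stand-in reranker that promotes the second-ranked provider."""
    return [head[1], head[0]] + head[2:]


scored = [
    {"provider_id": f"p{i}", "total_score": s}
    for i, s in enumerate([0.9, 0.8, 0.7, 0.6, 0.5, 0.4])
]
flag_off = [p["provider_id"] for p in apply_stage3(scored, False, swap_top_two)]
flag_on = [p["provider_id"] for p in apply_stage3(scored, True, swap_top_two)]
print(flag_off)
print(flag_on)
```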

Match Result Schema

from dataclasses import dataclass
from datetime import datetime


# @dataclass is an assumption here; any model base (e.g. Pydantic) would work.
# ProviderMatch is defined first so MatchResult's annotation can reference it.
@dataclass
class ProviderMatch:
    provider_id: str
    rank: int
    total_score: float                    # 0.0–1.0
    confidence: float                     # data completeness
    domain_scores: dict[str, float]       # per-domain breakdown
    explanation: str                      # natural language (locale-aware)
    explanation_locale: str               # 'ar', 'en', etc.
    flags: list[str]                      # 'comorbidity_risk', 'limited_data', etc.


@dataclass
class MatchResult:
    match_id: str
    patient_id: str
    tenant_id: str
    strategy_used: str                    # 'weighted_v1', 'agent_enhanced_v1.5'
    providers: list[ProviderMatch]        # ranked list
    metadata: dict                        # timing, model used, token count
    created_at: datetime

Shadow Mode & A/B Testing

New strategies run in shadow mode:

  1. New strategy executes alongside current strategy
  2. Both results logged to events table
  3. Only current strategy's results served to patient
  4. Compare quality in Metabase before enabling
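
The shadow-mode steps above can be sketched with `asyncio`. Everything named here is hypothetical: `run_with_shadow`, the two stand-in strategy coroutines, and the in-memory `event_log` list that takes the place of the events table. The sketch also assumes a shadow failure should be logged but never affect what the patient sees.

```python
import asyncio


async def run_with_shadow(current, shadow, patient_id: str, tenant_id: str,
                          event_log: list) -> dict:
    """Run both strategies concurrently, log both, serve only the current one."""
    served, shadow_result = await asyncio.gather(
        current(patient_id, tenant_id),
        shadow(patient_id, tenant_id),
        return_exceptions=True,  # a shadow failure must not break the request
    )
    if isinstance(served, BaseException):
        raise served
    event_log.append({"role": "current", "result": served})
    if isinstance(shadow_result, BaseException):
        event_log.append({"role": "shadow", "error": repr(shadow_result)})
    else:
        event_log.append({"role": "shadow", "result": shadow_result})
    return served


# Stand-in strategies returning minimal dict results.
async def weighted_v1(patient_id: str, tenant_id: str) -> dict:
    return {"strategy_used": "weighted_v1", "providers": ["prov_a", "prov_b"]}


async def ml_ranking_v2(patient_id: str, tenant_id: str) -> dict:
    return {"strategy_used": "ml_ranking_v2", "providers": ["prov_b", "prov_a"]}


event_log = []
served = asyncio.run(
    run_with_shadow(weighted_v1, ml_ranking_v2, "pat_1", "tenant_1", event_log)
)
print(served["strategy_used"], [e["role"] for e in event_log])
```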

A/B testing coordinated across:

  • Flagsmith: Strategy assignment (which users get which strategy)
  • Events table: Match outcomes and scores for both strategies
  • PostHog: User behavioral data (which results they click, how long they review)
  • Metabase: Comparative analysis dashboards

Matching Engine Interface

from typing import Optional, Protocol


class MatchingEngine:
    # self.strategies maps a strategy name to a MatchingStrategy instance.
    async def execute(
        self, patient_id: str, tenant_id: str,
        strategy: Optional[str] = None  # override, else from Flagsmith
    ) -> MatchResult:
        strategy = strategy or self.get_strategy_from_flagsmith(tenant_id)
        return await self.strategies[strategy].run(patient_id, tenant_id)

class MatchingStrategy(Protocol):
    async def run(self, patient_id: str, tenant_id: str) -> MatchResult: ...

class WeightedScoringV1(MatchingStrategy): ...
class AgentEnhancedV15(MatchingStrategy): ...
class MLRankingV2(MatchingStrategy): ...      # stub
class HybridV3(MatchingStrategy): ...          # stub
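
To show how the interface might be wired together, here is a minimal runnable sketch. It is not the production engine: `DemoEngine`, the `flag_lookup` callable (standing in for the Flagsmith lookup), and the dict-returning strategy coroutines are all hypothetical, and raising on an unknown strategy name is an assumed policy.

```python
import asyncio
from typing import Awaitable, Callable, Optional

StrategyFn = Callable[[str, str], Awaitable[dict]]


class DemoEngine:
    def __init__(self, strategies: dict, flag_lookup: Callable[[str], str]):
        self.strategies = strategies    # strategy name -> coroutine function
        self.flag_lookup = flag_lookup  # tenant_id -> strategy name

    async def execute(self, patient_id: str, tenant_id: str,
                      strategy: Optional[str] = None) -> dict:
        # An explicit override wins; otherwise the flag decides.
        name = strategy or self.flag_lookup(tenant_id)
        if name not in self.strategies:
            raise ValueError(f"unknown matching strategy: {name!r}")
        return await self.strategies[name](patient_id, tenant_id)


async def weighted_v1(patient_id: str, tenant_id: str) -> dict:
    return {"strategy_used": "weighted_v1", "providers": []}


async def agent_enhanced(patient_id: str, tenant_id: str) -> dict:
    return {"strategy_used": "agent_enhanced_v1.5", "providers": []}


engine = DemoEngine(
    {"weighted_v1": weighted_v1, "agent_enhanced_v1.5": agent_enhanced},
    flag_lookup=lambda tenant_id: "weighted_v1",
)
from_flag = asyncio.run(engine.execute("pat_1", "tenant_1"))
overridden = asyncio.run(
    engine.execute("pat_1", "tenant_1", strategy="agent_enhanced_v1.5")
)
print(from_flag["strategy_used"], overridden["strategy_used"])
```

Failing fast on an unrecognized name keeps a mistyped flag value from surfacing as a bare `KeyError` deep in the request path.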