
Matching Engine

Overview

The Matching Engine is Curaway's core differentiator -- the system that connects patients with the right providers and doctors based on clinical needs, preferences, outcomes data, and logistics. It uses a pluggable strategy pattern that allows multiple matching algorithms to coexist, be tested in shadow mode, and be swapped at runtime via feature flags.


Strategy Pattern

Architecture

graph TD
    API[POST /api/v1/cases/{id}/match] --> Router[Strategy Router]
    Router --> Flag{Flagsmith: matching_strategy_version}
    Flag -->|"v2.1"| GSW[Graph+Semantic Weighted v2.1]
    Flag -->|"v1.0"| WR[Weighted Rules v1.0]
    Flag -->|"v1.5"| AE[Agent-Enhanced v1.5]
    Flag -->|"v2.0"| ML[ML Ranking v2.0]
    Flag -->|"v3.0"| HY[Hybrid v3.0]

    style Router fill:#008B8B,color:#fff
    style GSW fill:#FF7F50,color:#fff
    style AE fill:#4A90D9,color:#fff

Strategy Interface

Every matching strategy implements a common interface:

from abc import ABC, abstractmethod


class MatchingStrategy(ABC):
    """Base class for all matching strategies."""

    @abstractmethod
    async def match(
        self,
        case: Case,
        clinical_data: ClinicalData,
        preferences: PatientPreferences,
        tenant_id: str,
    ) -> MatchResult:
        """Execute matching and return scored results."""
        ...

    @abstractmethod
    def get_version(self) -> str:
        """Return strategy version identifier."""
        ...

    @abstractmethod
    def get_scoring_dimensions(self) -> list[str]:
        """Return list of scoring dimensions used."""
        ...

Available Strategies

| Strategy | Version | Status | Description |
|---|---|---|---|
| Graph+Semantic Weighted | v2.1 | Active (default) | Neo4j traversal + Qdrant semantic + weighted scoring |
| Weighted Rules | v1.0 | Legacy | Pure rule-based scoring without graph or semantic components |
| Agent-Enhanced | v1.5 | Feature-flagged | LLM-enhanced reranking on top of v2.1 results |
| ML Ranking | v2.0 | Stub | Future: learning-to-rank model trained on historical outcomes |
| Hybrid | v3.0 | Stub | Future: ensemble of v2.1 + ML + agent signals |

Strategy Selection

The active strategy is selected per tenant via the Flagsmith flag matching_strategy_version. This allows different tenants to run different strategies and enables gradual rollout of new ones.


Active Strategy: Graph+Semantic Weighted v2.1

Scoring Dimensions

The active strategy scores providers across 7 weighted dimensions:

| Dimension | Weight | Source | Description |
|---|---|---|---|
| clinical_relevance | 0.25 | Neo4j + FHIR | How well the provider's offerings match the patient's clinical needs |
| outcome_score | 0.20 | Neo4j OFFERS metadata | Success rate, complication rate, volume-based confidence |
| cost_score | 0.15 | Neo4j OFFERS metadata | Cost relative to patient budget and market average |
| semantic_match | 0.10 | Qdrant cosine similarity | Semantic similarity between patient needs and provider profile |
| travel_logistics | 0.10 | Computed | Visa requirements, flight connections, time zone difference |
| accreditation | 0.10 | Neo4j HAS_ACCREDITATION | JCI, NABH, and other accreditation presence |
| patient_preferences | 0.10 | Patient profile | Language match, country preference, dietary accommodation |

STRATEGY_WEIGHTS = {
    "clinical_relevance": 0.25,
    "outcome_score": 0.20,
    "cost_score": 0.15,
    "semantic_match": 0.10,
    "travel_logistics": 0.10,
    "accreditation": 0.10,
    "patient_preferences": 0.10,
}

Weights Must Sum to 1.0

The system validates that all weights sum to 1.0 at startup. If weight redistribution is applied (due to missing data), the redistributed weights are also validated.
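The startup check can be sketched as a float-tolerant sum validation (the function name and tolerance are assumptions; only the sum-to-1.0 invariant comes from the text above):

```python
def validate_weights(weights: dict[str, float], tolerance: float = 1e-6) -> None:
    """Raise if the strategy weights do not sum to 1.0 (within tolerance)."""
    total = sum(weights.values())
    if abs(total - 1.0) > tolerance:
        raise ValueError(f"Strategy weights sum to {total}, expected 1.0")


STRATEGY_WEIGHTS = {
    "clinical_relevance": 0.25,
    "outcome_score": 0.20,
    "cost_score": 0.15,
    "semantic_match": 0.10,
    "travel_logistics": 0.10,
    "accreditation": 0.10,
    "patient_preferences": 0.10,
}

validate_weights(STRATEGY_WEIGHTS)  # passes: weights sum to 1.0
```

The same check would run again on the output of weight redistribution, since redistribution preserves the total.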

Execution Flow

sequenceDiagram
    participant API
    participant Router as Strategy Router
    participant Neo as Neo4j
    participant QD as Qdrant
    participant Scorer as Scoring Engine
    participant DB as PostgreSQL

    API->>Router: match(case, clinical_data, preferences)
    Router->>Neo: Cypher: Find providers offering required procedures
    Neo-->>Router: Candidate providers with OFFERS metadata
    Router->>QD: Semantic search: patient needs vs. provider vectors
    QD-->>Router: Cosine similarity scores
    Router->>Scorer: Score candidates across 7 dimensions
    Scorer->>Scorer: Apply weights, normalize scores
    Scorer->>Scorer: Weight redistribution (if missing data)
    Scorer-->>Router: Ranked results with per-dimension scores
    Router->>DB: Store match_results
    Router-->>API: MatchResult with explanations

Graph-Enhanced Flow Detail

Step 1: Neo4j Traversal

MATCH (c:Condition {code: $condition_code})-[:REQUIRES]->(proc:Procedure)
      <-[offers:OFFERS]-(prov:Provider)-[:LOCATED_IN]->(loc:Location)
WHERE prov.tenant_id = $tenant_id
  AND prov.is_active = true
OPTIONAL MATCH (prov)-[:HAS_ACCREDITATION]->(acc:Accreditation)
RETURN prov, proc, offers, loc, COLLECT(DISTINCT acc) AS accreditations

Step 2: OFFERS Metadata Extraction

The OFFERS relationship carries rich metadata that feeds directly into scoring:

offers_data = {
    "cost_usd": relationship["cost_usd"],
    "annual_volume": relationship["annual_volume"],
    "success_rate": relationship["success_rate"],
    "average_los_days": relationship["average_los_days"],
    "wait_time_weeks": relationship["wait_time_weeks"],
    "package_includes": relationship["package_includes"],
}

Step 3: Scoring with Graph Data

Each dimension scorer receives the full context:

async def score_outcome(
    provider: Provider,
    offers_data: dict,
    procedure_code: str,
) -> float:
    """Score provider outcomes for the specific procedure."""
    success_rate = offers_data.get("success_rate", 0)
    volume = offers_data.get("annual_volume", 0)

    # Volume-based confidence: more procedures = more reliable data
    volume_confidence = min(volume / 200, 1.0)  # Caps at 200/year

    # Weighted combination
    raw_score = (success_rate / 100) * 0.7 + volume_confidence * 0.3

    return round(raw_score, 4)
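
The cost dimension can be sketched the same way. The table above says only that cost is scored relative to patient budget and market average; the blend weights, names, and neutral fallback below are illustrative assumptions:

```python
def score_cost(
    cost_usd: float,
    patient_budget_usd: float,
    market_avg_usd: float,
) -> float:
    """Illustrative cost scorer: rewards providers at or under budget
    and below the market average (assumed blend, not the shipped logic)."""
    if patient_budget_usd <= 0 or market_avg_usd <= 0:
        return 0.5  # neutral when reference points are missing

    # Ratios cap at 1.0: being far under budget is not scored higher
    # than being exactly on budget.
    budget_fit = min(patient_budget_usd / cost_usd, 1.0) if cost_usd > 0 else 1.0
    market_fit = min(market_avg_usd / cost_usd, 1.0) if cost_usd > 0 else 1.0

    return round(budget_fit * 0.6 + market_fit * 0.4, 4)
```

For example, a $12,000 package against a $10,000 budget and a $9,000 market average scores 0.8; anything at or under both reference points scores 1.0.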

Doctor-Level Scoring (Session 26)

DOCTORS_IN_MATCHING Flag

When the DOCTORS_IN_MATCHING Flagsmith flag is enabled, the matching engine extends scoring to individual doctors within each matched provider.

graph TD
    A[Provider-Level Match] --> B{DOCTORS_IN_MATCHING?}
    B -->|Yes| C[Fetch Affiliated Doctors]
    C --> D[Score Each Doctor]
    D --> E[Language Concordance]
    D --> F[Procedure-Specific Metrics]
    D --> G[Data Completeness Factor]
    E --> H[Doctor-Enriched Results]
    F --> H
    G --> H
    B -->|No| I[Provider-Only Results]

    style B fill:#008B8B,color:#fff
    style H fill:#FF7F50,color:#fff

Language Concordance Scoring

Language concordance between patient and doctor is scored across 6 tiers:

| Tier | Score | Definition | Example |
|---|---|---|---|
| Native | 1.00 | Doctor speaks patient's native language natively | Hindi patient, Hindi-native doctor |
| Fluent | 0.85 | Doctor is fluent in patient's language | English patient, English-fluent doctor in India |
| Professional | 0.70 | Professional working proficiency | Arabic patient, doctor with Arabic professional cert |
| Conversational | 0.50 | Basic conversational ability | Turkish patient, doctor with basic Turkish |
| Interpreter Available | 0.30 | Provider offers interpretation services | Thai patient, hospital has Thai medical interpreter |
| None | 0.00 | No language overlap, no interpretation | No common language |

def score_language_concordance(
    patient_language: str,
    doctor_languages: list[dict],
    provider_language_services: dict,
) -> float:
    """Score language match between patient and doctor."""
    for lang in doctor_languages:
        if lang["language"].lower() == patient_language.lower():
            proficiency_scores = {
                "native": 1.00,
                "fluent": 0.85,
                "professional": 0.70,
                "conversational": 0.50,
            }
            return proficiency_scores.get(lang["proficiency"], 0.50)

    # Check provider-level interpretation services
    interpreters = provider_language_services.get("medical_interpreters", [])
    if patient_language in interpreters:
        return 0.30

    return 0.00

Data Completeness Confidence Factor

Doctor scores are adjusted by a confidence factor derived from data completeness:

def apply_completeness_factor(
    raw_score: float,
    data_completeness: dict,
) -> float:
    """Adjust score based on data completeness confidence."""
    overall_completeness = data_completeness.get("overall", 0.5)

    # Minimum floor of 0.5 to avoid penalizing new doctors too heavily
    confidence = max(overall_completeness, 0.5)

    # Blend raw score toward 0.5 (neutral) based on missing data
    adjusted = raw_score * confidence + 0.5 * (1 - confidence)

    return round(adjusted, 4)

Why Blend Toward 0.5?

When data is incomplete, we don't want to assume the doctor is either great or terrible. Blending toward 0.5 (neutral) means incomplete profiles are ranked in the middle, not at the top or bottom.
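
To make the blend concrete, here is the function above applied to a raw score of 0.9 at several completeness levels (the function is restated so the example is self-contained):

```python
def apply_completeness_factor(raw_score: float, data_completeness: dict) -> float:
    """Adjust score based on data completeness confidence (as above)."""
    overall_completeness = data_completeness.get("overall", 0.5)
    confidence = max(overall_completeness, 0.5)
    return round(raw_score * confidence + 0.5 * (1 - confidence), 4)


apply_completeness_factor(0.9, {"overall": 1.0})  # 0.9  -- full data: unchanged
apply_completeness_factor(0.9, {"overall": 0.8})  # 0.82 -- pulled slightly toward 0.5
apply_completeness_factor(0.9, {"overall": 0.2})  # 0.7  -- floor of 0.5 applies
```

Note the last case: even a nearly empty profile is never blended more than halfway toward neutral, because confidence is floored at 0.5.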

Template-Based Match Reasoning

Doctor match results include human-readable reasoning generated from templates:

REASONING_TEMPLATES = {
    "high_volume_specialist": (
        "Dr. {name} has performed {volume} {procedure} procedures annually "
        "with a {success_rate}% success rate, placing them in the top tier "
        "of specialists at {provider_name}."
    ),
    "language_concordance": (
        "Dr. {name} speaks {language} at {proficiency} level, enabling "
        "direct communication without interpretation."
    ),
    "technique_match": (
        "Dr. {name} specializes in {technique} {procedure}, which is the "
        "recommended approach for your specific condition profile."
    ),
}
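
Rendering is plain string formatting over a context dict. The doctor name and values below are illustrative, not real data:

```python
# One template restated from above so the example is self-contained.
REASONING_TEMPLATES = {
    "language_concordance": (
        "Dr. {name} speaks {language} at {proficiency} level, enabling "
        "direct communication without interpretation."
    ),
}

# Hypothetical context assembled from the doctor match.
context = {"name": "Mehta", "language": "Hindi", "proficiency": "native"}
reasoning = REASONING_TEMPLATES["language_concordance"].format(**context)
# -> "Dr. Mehta speaks Hindi at native level, enabling direct
#     communication without interpretation."
```

Templates keep the reasoning deterministic and auditable, unlike free-form LLM generation.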

Weight Redistribution

When a scoring dimension has insufficient data for a provider, its weight is redistributed proportionally to the other dimensions:

def redistribute_weights(
    base_weights: dict[str, float],
    available_dimensions: set[str],
) -> dict[str, float]:
    """Redistribute weights from unavailable dimensions."""
    unavailable = set(base_weights.keys()) - available_dimensions
    if not unavailable:
        return base_weights

    total_unavailable_weight = sum(base_weights[d] for d in unavailable)
    total_available_weight = sum(base_weights[d] for d in available_dimensions)

    redistributed = {}
    for dim in available_dimensions:
        original = base_weights[dim]
        share = original / total_available_weight
        redistributed[dim] = original + (total_unavailable_weight * share)

    return redistributed

Example: If travel_logistics data is missing (weight 0.10):

| Dimension | Original Weight | Redistributed Weight |
|---|---|---|
| clinical_relevance | 0.25 | 0.278 |
| outcome_score | 0.20 | 0.222 |
| cost_score | 0.15 | 0.167 |
| semantic_match | 0.10 | 0.111 |
| travel_logistics | 0.10 | -- |
| accreditation | 0.10 | 0.111 |
| patient_preferences | 0.10 | 0.111 |
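
The table can be reproduced by calling the function with travel_logistics absent (the function is restated so the example runs standalone):

```python
def redistribute_weights(
    base_weights: dict[str, float],
    available_dimensions: set[str],
) -> dict[str, float]:
    """Redistribute weights from unavailable dimensions (as above)."""
    unavailable = set(base_weights.keys()) - available_dimensions
    if not unavailable:
        return base_weights

    total_unavailable_weight = sum(base_weights[d] for d in unavailable)
    total_available_weight = sum(base_weights[d] for d in available_dimensions)

    redistributed = {}
    for dim in available_dimensions:
        original = base_weights[dim]
        share = original / total_available_weight
        redistributed[dim] = original + (total_unavailable_weight * share)
    return redistributed


base = {
    "clinical_relevance": 0.25, "outcome_score": 0.20, "cost_score": 0.15,
    "semantic_match": 0.10, "travel_logistics": 0.10,
    "accreditation": 0.10, "patient_preferences": 0.10,
}
result = redistribute_weights(base, set(base) - {"travel_logistics"})
round(result["clinical_relevance"], 3)  # 0.278 -- matches the table
```

The redistributed weights still sum to 1.0, so the startup validation holds after redistribution.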

Shadow Mode and A/B Testing

Shadow Mode

New strategies can run in shadow mode alongside the active strategy. Shadow mode executes the new strategy on every match request but discards the results, logging them only for comparison.

async def match_with_shadow(
    case: Case,
    clinical_data: ClinicalData,
    preferences: PatientPreferences,
    tenant_id: str,
) -> MatchResult:
    """Run active strategy + optional shadow strategy."""
    # Active strategy (returned to patient)
    active_result = await active_strategy.match(case, clinical_data, preferences, tenant_id)

    # Shadow strategy (logged, not returned)
    shadow_flag = await flagsmith.get_flag("matching_shadow_strategy", tenant_id)
    if shadow_flag:
        shadow_strategy = get_strategy(shadow_flag)
        shadow_result = await shadow_strategy.match(case, clinical_data, preferences, tenant_id)
        await log_shadow_comparison(active_result, shadow_result, case.id)

    return active_result
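
log_shadow_comparison is internal, but one signal it might record is rank agreement between the two strategies' top-N provider lists. This metric and its name are assumptions for illustration:

```python
def top_n_overlap(active_ids: list[str], shadow_ids: list[str], n: int = 5) -> float:
    """Fraction of the active strategy's top-N providers that also appear
    in the shadow strategy's top-N. 1.0 = identical top-N sets."""
    active_top = set(active_ids[:n])
    shadow_top = set(shadow_ids[:n])
    if not active_top:
        return 0.0
    return len(active_top & shadow_top) / len(active_top)


top_n_overlap(["p1", "p2", "p3"], ["p2", "p3", "p4"], n=3)  # 0.666...
```

A consistently high overlap suggests the shadow strategy is safe to promote; a low overlap flags it for manual review of the diverging cases.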

A/B Testing via Flagsmith + PostHog

graph LR
    A[Patient Request] --> B{Flagsmith A/B Split}
    B -->|Group A: 80%| C[Strategy v2.1]
    B -->|Group B: 20%| D[Strategy v1.5]
    C --> E[PostHog: Track Outcomes]
    D --> E
    E --> F[Analyze: Conversion, Satisfaction, Time-to-Match]

    style B fill:#008B8B,color:#fff
    style E fill:#FF7F50,color:#fff

A/B tests track these metrics in PostHog:

| Metric | Description | Target |
|---|---|---|
| match_click_through | % of patients who click on a matched provider | > 60% |
| consultation_booked | % of matches that result in a consultation booking | > 25% |
| time_to_decision | Time from match presentation to patient action | < 48 hours |
| patient_satisfaction | Post-match survey score (1-5) | > 4.0 |
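
A sketch of the event payload such a metric might send to PostHog. The event name prefix and property names are assumptions; the real schema lives in the analytics code:

```python
def build_match_event(
    case_id: str,
    strategy_version: str,
    metric: str,
    value: float,
) -> dict:
    """Build an analytics event for one A/B metric observation.

    Keeping strategy_version on every event lets PostHog segment each
    metric by experiment group without a separate join.
    """
    return {
        "event": f"matching_{metric}",
        "properties": {
            "case_id": case_id,
            "strategy_version": strategy_version,
            "value": value,
        },
    }


event = build_match_event("case-123", "v2.1", "match_click_through", 1.0)
# {"event": "matching_match_click_through", "properties": {...}}
```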

Procedure Requirements API

Overview

Procedures have specific document requirements (e.g., hip replacement requires recent X-rays, blood work, cardiac clearance). The Procedure Requirements API manages these requirements with support for provider-specific overrides.

Base Requirements

class ProcedureRequirement(BaseModel):
    """A document requirement for a procedure."""
    id: UUID
    tenant_id: UUID
    procedure_code: str                     # CPT code
    requirement_type: str                   # "diagnostic", "lab", "imaging", "clearance"
    name: str                               # "Complete Blood Count"
    description: str                        # Detailed description for matching
    max_age_days: int                       # Maximum acceptable age of the document
    is_mandatory: bool = True
    alternatives: list[str] = []            # Alternative acceptable documents

Provider-Specific Overrides

Providers can override base requirements (e.g., requiring additional tests or accepting older results):

class ProviderRequirementOverride(BaseModel):
    """Provider-specific override to a base requirement."""
    id: UUID
    tenant_id: UUID
    provider_id: UUID
    procedure_requirement_id: UUID
    max_age_days_override: Optional[int]    # Provider accepts older docs
    is_mandatory_override: Optional[bool]   # Provider makes it optional
    additional_notes: Optional[str]         # Provider-specific instructions
    additional_requirements: list[dict]     # Extra tests this provider needs

API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/procedures/{code}/requirements | GET | List base requirements for a procedure |
| /api/v1/providers/{id}/procedures/{code}/requirements | GET | List requirements with provider overrides |
| /api/v1/cases/{id}/requirements/status | GET | Check which requirements are fulfilled for a case |

Match Result Schema

class MatchResult(BaseModel):
    """Complete match result for a case."""
    id: UUID
    case_id: UUID
    tenant_id: UUID
    strategy_version: str
    strategy_weights: dict[str, float]
    providers: list[ProviderMatch]
    doctors: Optional[list[DoctorMatch]]    # Only if DOCTORS_IN_MATCHING
    executed_at: datetime
    execution_time_ms: int
    shadow_strategy_version: Optional[str]

class ProviderMatch(BaseModel):
    """Individual provider match with scores."""
    provider_id: UUID
    rank: int
    overall_score: float                    # 0.0 - 1.0
    dimension_scores: dict[str, float]      # Per-dimension scores
    weights_used: dict[str, float]          # Actual weights (after redistribution)
    reasoning: str                          # Human-readable explanation
    strengths: list[str]
    considerations: list[str]

class DoctorMatch(BaseModel):
    """Individual doctor match within a provider."""
    doctor_id: UUID
    provider_id: UUID
    rank: int
    overall_score: float
    language_concordance: float
    procedure_metrics: dict
    data_completeness_score: float
    reasoning: str