Matching Pipeline

Overview

The Curaway matching engine runs a four-stage pipeline (semantic search, graph traversal, weighted scoring, and an optional LLM enhancement) to match each patient with the best provider and doctor combination.

Stages

Stage 1: Qdrant Semantic Search

Query: patient conditions + procedures as natural language
Collection: providers (42 vectors)
Returns: cosine similarity scores per provider
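
As a rough illustration of what Stage 1 returns, the sketch below ranks providers by cosine similarity between a query embedding and provider embeddings. It uses plain Python rather than the Qdrant client, and the three-dimensional vectors and provider IDs are invented for the example; real embeddings would come from whichever embedding model the pipeline uses.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_providers(query_vec, provider_vecs):
    """Return (provider_id, score) pairs sorted by descending similarity."""
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in provider_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings (real ones would have hundreds of dimensions).
provider_vecs = {
    "provider_a": [1.0, 0.0, 0.0],
    "provider_b": [0.6, 0.8, 0.0],
    "provider_c": [0.0, 0.0, 1.0],
}
query_vec = [1.0, 0.0, 0.0]
ranking = rank_providers(query_vec, provider_vecs)
```

Here `ranking` lists `provider_a` first, then `provider_b`, then `provider_c`. In the pipeline, Qdrant performs this comparison server-side over the `providers` collection.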

Stage 2: Neo4j Graph Traversal

Path: Patient → HAS_CONDITION → Condition → REQUIRES → Procedure ← OFFERS ← Provider
Returns: providers with OFFERS metadata (cost, volume, success rate), accreditations, recovery phases, required tests
Also: Doctor → PERFORMS → Procedure with doctor-level outcomes
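
The traversal above could be expressed in Cypher roughly as follows. This is a sketch only: the node labels and relationship types are taken from the path, but the property names (`id`, `name`, `cost`, `volume`, `success_rate`), the `$patient_id` parameter, and the `build_query` helper are assumptions, and the actual queries may also return accreditations, recovery phases, and required tests.

```python
# Cypher for the Stage 2 path, held as Python strings.
# Labels and relationship types come from the documented path;
# property names and the $patient_id parameter are assumptions.
PROVIDER_TRAVERSAL = """
MATCH (pt:Patient {id: $patient_id})-[:HAS_CONDITION]->(c:Condition)
      -[:REQUIRES]->(proc:Procedure)<-[o:OFFERS]-(prov:Provider)
RETURN prov.id AS provider_id,
       proc.name AS procedure,
       o.cost AS cost,
       o.volume AS volume,
       o.success_rate AS success_rate
"""

# Doctor-level variant: doctors who perform the same procedures.
DOCTOR_TRAVERSAL = """
MATCH (pt:Patient {id: $patient_id})-[:HAS_CONDITION]->(:Condition)
      -[:REQUIRES]->(proc:Procedure)<-[p:PERFORMS]-(doc:Doctor)
RETURN doc.id AS doctor_id,
       proc.name AS procedure,
       p.success_rate AS success_rate
"""


def build_query(patient_id: str) -> tuple[str, dict]:
    """Pair the provider query with its parameters, ready for a Neo4j session."""
    return PROVIDER_TRAVERSAL, {"patient_id": patient_id}
```

Passing the patient ID as a query parameter rather than formatting it into the string keeps the query plan cacheable and avoids injection issues.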

Stage 3: PostgreSQL Scoring

7 weighted dimensions: clinical_relevance (0.25), outcome_score (0.20), semantic_match (0.10), cost_score (0.15), travel_logistics (0.10), accreditation (0.10), patient_preferences (0.10)
Weight redistribution when dimensions have missing data
Doctor-level scoring with data completeness confidence factor
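
The weighted sum and the redistribution step can be sketched as below. The weights are the seven listed above; the redistribution rule shown (drop missing dimensions and rescale the remaining weights proportionally so they sum to 1) is one plausible reading and an assumption, as are the function names.

```python
WEIGHTS = {
    "clinical_relevance": 0.25,
    "outcome_score": 0.20,
    "semantic_match": 0.10,
    "cost_score": 0.15,
    "travel_logistics": 0.10,
    "accreditation": 0.10,
    "patient_preferences": 0.10,
}


def redistribute_weights(scores: dict) -> dict:
    """Drop dimensions whose score is None and rescale the rest to sum to 1."""
    available = {dim: w for dim, w in WEIGHTS.items() if scores.get(dim) is not None}
    total = sum(available.values())
    if total == 0:
        return {}
    return {dim: w / total for dim, w in available.items()}


def composite_score(scores: dict) -> float:
    """Weighted sum over the dimensions that actually have data."""
    weights = redistribute_weights(scores)
    return sum(weights[dim] * scores[dim] for dim in weights)


# Example: cost and travel data are missing for this provider.
example_scores = {
    "clinical_relevance": 0.9,
    "outcome_score": 0.8,
    "semantic_match": 0.7,
    "cost_score": None,
    "travel_logistics": None,
    "accreditation": 1.0,
    "patient_preferences": 0.5,
}
example_weights = redistribute_weights(example_scores)
example_total = composite_score(example_scores)
```

With `cost_score` and `travel_logistics` missing, the remaining weights total 0.75, so `clinical_relevance` rises from 0.25 to about 0.33 and the other dimensions scale up by the same factor.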

Stage 4: LLM Enhancement (Feature-flagged)

Agent-enhanced matching: re-ranking via match_agent
Agent-enhanced explanations: natural language per provider
Template-based doctor match reasoning (LLM deferred)
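
One way to gate the two agent steps behind flags is sketched below. The flag keys `agent_enhanced_matching` and `agent_enhanced_explanations`, the `enhance_matches` function, and the `rerank` and `explain` callables are all hypothetical stand-ins for the real match_agent integration.

```python
from typing import Callable

Match = dict  # e.g. {"provider_id": "...", "score": 0.8}


def enhance_matches(
    matches: list[Match],
    flags: dict[str, bool],
    rerank: Callable[[list[Match]], list[Match]],
    explain: Callable[[Match], str],
) -> list[Match]:
    """Apply optional LLM steps; with both flags off the input order is kept."""
    result = [dict(m) for m in matches]  # copy so the caller's list is untouched
    if flags.get("agent_enhanced_matching", False):
        result = rerank(result)
    if flags.get("agent_enhanced_explanations", False):
        for m in result:
            m["explanation"] = explain(m)
    return result
```

Treating a missing flag as off means Stages 1 to 3 still produce a usable ranking when the LLM path is disabled or not configured.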

Doctor-Level Scoring (Session 26)

When DOCTORS_IN_MATCHING feature flag is ON:

1. For each procedure needed, fetch doctors via get_doctors_for_procedure()
2. Score language concordance (6 tiers: 1.0 native → 0.0 none)
3. Build procedure stats (volume, success rate, technique, PROMs)
4. Generate template-based match reasoning
5. Apply confidence: final_score = provider_score × data_completeness_score
6. Include doctor and language_support objects in match response
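
A minimal sketch of the confidence step (step 5) follows. The multiplication is the formula given above; how data_completeness_score is actually computed is not documented here, so the "fraction of expected fields present" rule, the field names, and both helper functions are assumptions for illustration.

```python
def data_completeness(stats: dict, expected_fields: tuple[str, ...]) -> float:
    """Fraction of expected fields that are present and not None."""
    if not expected_fields:
        return 0.0
    present = sum(1 for f in expected_fields if stats.get(f) is not None)
    return present / len(expected_fields)


def apply_confidence(provider_score: float, completeness: float) -> float:
    """final_score = provider_score × data_completeness_score."""
    return provider_score * completeness


# Hypothetical procedure stats for one doctor: two of four fields are known.
EXPECTED = ("volume", "success_rate", "technique", "proms")
doctor_stats = {"volume": 120, "success_rate": 0.95, "technique": None, "proms": None}
completeness = data_completeness(doctor_stats, EXPECTED)
final_score = apply_confidence(0.8, completeness)
```

In this example half of the expected fields are populated, so a provider score of 0.8 is discounted to 0.4. The effect is that sparsely documented doctors rank below well-documented ones with the same underlying provider score.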

Qdrant Collections

| Collection | Vectors | Purpose |
| --- | --- | --- |
| `providers` | 42 | Provider semantic search |
| `requirement_embeddings` | 70 | Document-to-requirement matching |
| `conditions` | 12 | Condition semantic search |
| `document_embeddings` | Grows | Individual document embeddings |

Note: Doctor embeddings are a future addition, deferred until after Session 26 is stable.