Matching Pipeline
Overview
The Curaway matching engine uses a multi-stage pipeline to match each patient with the best combination of provider and doctor.
Stages
Stage 1: Qdrant Semantic Search
- Query: patient conditions + procedures as natural language
- Collection: providers (42 vectors)
- Returns: cosine similarity scores per provider
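As a rough illustration of what the similarity scores in this stage mean, the sketch below ranks providers by cosine similarity in plain Python. It does not call Qdrant; the function names (`cosine_similarity`, `rank_providers`), the provider IDs, and the three-dimensional toy vectors are all assumptions made for the example, not part of the Curaway codebase.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as no match
    return dot / (norm_a * norm_b)


def rank_providers(query_vector, provider_vectors, top_k=3):
    """Return the top_k (provider_id, score) pairs, best match first."""
    scored = [
        (provider_id, cosine_similarity(query_vector, vector))
        for provider_id, vector in provider_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real embeddings have far more dimensions.
providers = {
    "provider_a": [1.0, 0.0, 0.0],
    "provider_b": [0.0, 1.0, 0.0],
    "provider_c": [1.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]  # stands in for the embedded patient query
ranked = rank_providers(query, providers, top_k=2)
```

In the real pipeline the vector store performs this ranking; the sketch only shows how a per-provider score can be read.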
Stage 2: Neo4j Graph Traversal
- Path: Patient → HAS_CONDITION → Condition → REQUIRES → Procedure ← OFFERS ← Provider
- Returns: providers with OFFERS metadata (cost, volume, success rate), accreditations, recovery phases, required tests
- Also: Doctor → PERFORMS → Procedure with doctor-level outcomes
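The traversal path above can be sketched with an in-memory edge list instead of a Neo4j query. This is only a minimal model of the path shape: the node IDs, the `EDGES` list, and the helpers `build_index` and `providers_for_patient` are hypothetical, and the OFFERS metadata (cost, volume, success rate) is left out for brevity.

```python
from collections import defaultdict

# Hypothetical toy graph as (source, relationship, target) triples.
EDGES = [
    ("patient_1", "HAS_CONDITION", "knee_osteoarthritis"),
    ("knee_osteoarthritis", "REQUIRES", "total_knee_replacement"),
    ("provider_a", "OFFERS", "total_knee_replacement"),
    ("provider_b", "OFFERS", "hip_replacement"),
    ("doctor_x", "PERFORMS", "total_knee_replacement"),
]


def build_index(edges):
    """Index edges by (node, relationship) in both directions."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for source, relationship, target in edges:
        outgoing[(source, relationship)].append(target)
        incoming[(target, relationship)].append(source)
    return outgoing, incoming


def providers_for_patient(patient_id, edges):
    """Follow Patient -> Condition -> Procedure, then walk OFFERS and
    PERFORMS edges backwards to find providers and doctors."""
    outgoing, incoming = build_index(edges)
    matches = []
    for condition in outgoing[(patient_id, "HAS_CONDITION")]:
        for procedure in outgoing[(condition, "REQUIRES")]:
            doctors = list(incoming[(procedure, "PERFORMS")])
            for provider in incoming[(procedure, "OFFERS")]:
                matches.append(
                    {"provider": provider, "procedure": procedure, "doctors": doctors}
                )
    return matches


result = providers_for_patient("patient_1", EDGES)
```

Here `provider_b` is excluded because it offers a procedure the patient's condition does not require.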
Stage 3: PostgreSQL Scoring
- 7 weighted dimensions: clinical_relevance (0.25), outcome_score (0.20), semantic_match (0.10), cost_score (0.15), travel_logistics (0.10), accreditation (0.10), patient_preferences (0.10)
- Weights are redistributed across the remaining dimensions when a dimension has missing data
- Doctor-level scoring with data completeness confidence factor
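The weighted sum and the redistribution step could look roughly like the sketch below. The weights are the seven listed above; the redistribution rule (renormalizing over the dimensions that have data, so each remaining weight grows proportionally) is an assumption, since the exact rule is not specified here, and `weighted_score` is a hypothetical helper name.

```python
# Dimension weights as documented above; they sum to 1.0.
WEIGHTS = {
    "clinical_relevance": 0.25,
    "outcome_score": 0.20,
    "semantic_match": 0.10,
    "cost_score": 0.15,
    "travel_logistics": 0.10,
    "accreditation": 0.10,
    "patient_preferences": 0.10,
}


def weighted_score(dimension_scores):
    """Combine per-dimension scores (each in 0..1) into one score.

    Dimensions whose score is None are treated as missing, and their
    weight is redistributed proportionally over the rest (an assumed rule).
    """
    available = {
        name: score
        for name, score in dimension_scores.items()
        if name in WEIGHTS and score is not None
    }
    total_weight = sum(WEIGHTS[name] for name in available)
    if total_weight == 0:
        return 0.0  # no usable data at all
    weighted_sum = sum(WEIGHTS[name] * score for name, score in available.items())
    return weighted_sum / total_weight


# Only two dimensions have data: (0.25 * 1.0 + 0.20 * 0.5) / 0.45
partial = weighted_score(
    {"clinical_relevance": 1.0, "outcome_score": 0.5, "cost_score": None}
)
```

Renormalizing keeps the result on the same 0 to 1 scale regardless of how many dimensions are missing, which makes providers with sparse data comparable to those with full data.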
Stage 4: LLM Enhancement (Feature-flagged)
- Agent-enhanced matching: re-ranking via match_agent
- Agent-enhanced explanations: natural language per provider
- Template-based doctor match reasoning (LLM deferred)
Doctor-Level Scoring (Session 26)
When the DOCTORS_IN_MATCHING feature flag is ON:
1. For each procedure needed, fetch doctors via get_doctors_for_procedure()
2. Score language concordance (6 tiers: 1.0 native → 0.0 none)
3. Build procedure stats (volume, success rate, technique, PROMs)
4. Generate template-based match reasoning
5. Apply confidence: final_score = provider_score × data_completeness_score
6. Include doctor and language_support objects in match response
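Steps 2 and 5 above can be sketched as follows. Only the endpoints of the language scale (1.0 native, 0.0 none) and the formula `final_score = provider_score × data_completeness_score` come from this page. The four intermediate tier names and values, the `EXPECTED_FIELDS` list, the completeness rule (fraction of fields present), and the function names are placeholders chosen for illustration.

```python
# Six language concordance tiers. Only "native" and "none" are documented;
# the intermediate names and values are hypothetical placeholders.
LANGUAGE_TIERS = {
    "native": 1.0,
    "fluent": 0.8,
    "professional": 0.6,
    "conversational": 0.4,
    "interpreter": 0.2,
    "none": 0.0,
}

# Procedure stats named in step 3; treating them as the completeness
# checklist is an assumption.
EXPECTED_FIELDS = ("volume", "success_rate", "technique", "proms")


def data_completeness(procedure_stats):
    """Fraction of expected fields that are present and not None."""
    present = sum(
        1 for field in EXPECTED_FIELDS if procedure_stats.get(field) is not None
    )
    return present / len(EXPECTED_FIELDS)


def score_doctor(provider_score, procedure_stats, language_tier):
    """Apply the confidence factor and attach the language score."""
    completeness = data_completeness(procedure_stats)
    return {
        "final_score": provider_score * completeness,
        "data_completeness_score": completeness,
        "language_concordance": LANGUAGE_TIERS[language_tier],
    }


doctor_result = score_doctor(
    provider_score=0.8,
    procedure_stats={"volume": 120, "success_rate": 0.95, "technique": None},
    language_tier="native",
)
```

With two of four stats present, the completeness factor is 0.5, so a provider score of 0.8 is discounted to 0.4. How the language score feeds into the final ranking is not shown, because this page does not describe it.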
Qdrant Collections
| Collection | Vectors | Purpose |
|---|---|---|
| providers | 42 | Provider semantic search |
| requirement_embeddings | 70 | Document-to-requirement matching |
| conditions | 12 | Condition semantic search |
| document_embeddings | Grows | Individual document embeddings |
Note: Doctor embeddings are a future addition, deferred until Session 26 is stable.