ADR-0002: Voyage AI over OpenAI Embeddings

Status: Accepted
Date: 2026-03-20
Session: 11

Context

Curaway's provider matching pipeline uses semantic search to find clinical trial sites and specialists relevant to a patient's condition. This requires converting both patient records and provider profiles into vector embeddings, storing them in Qdrant, and performing nearest-neighbor lookups at query time.
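The nearest-neighbor step can be illustrated with a minimal sketch. This is not the production pipeline: it uses toy 4-dimensional vectors in place of real 1024-dimensional Voyage embeddings, and plain Python cosine similarity in place of Qdrant; the provider names are made up for illustration.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k corpus entries most similar to the query vector."""
    ranked = sorted(corpus.items(), key=lambda item: cosine(query, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Toy "embeddings" standing in for Voyage vectors of provider profiles.
corpus = {
    "oncology-site-a":   [0.9, 0.1, 0.0, 0.0],
    "cardiology-site-b": [0.0, 0.9, 0.1, 0.0],
    "oncology-site-c":   [0.8, 0.2, 0.1, 0.0],
}
query = [1.0, 0.0, 0.0, 0.0]  # stand-in for an embedded patient record
print(top_k(query, corpus))   # the two oncology sites rank first
```

Qdrant performs the same ranking, but with approximate indexes (HNSW) so it stays fast as the corpus grows.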

The quality of embeddings directly impacts match relevance. Medical text contains domain-specific terminology (ICD-10 codes, drug names, anatomical references) that general-purpose embedding models may not handle well.

Decision

Use Voyage AI voyage-3.5-lite (1024 dimensions) as the embedding model for all vector operations in Qdrant, instead of OpenAI text-embedding-3-small (1536 dimensions).

Rationale

  • Superior retrieval quality on medical text. Voyage AI models are benchmarked specifically for domain-specific retrieval tasks. Internal testing on Curaway's provider corpus showed measurably better recall@10 for medical queries compared to OpenAI embeddings.
  • Generous free tier. Voyage AI offers 50 million tokens per month free, compared to OpenAI's pay-per-use model (~$0.02/1M tokens). At Curaway's current scale, embedding costs are effectively zero with Voyage AI.
  • Lower dimensionality. 1024 dimensions vs 1536 dimensions means ~33% less storage in Qdrant and faster similarity computations. This matters as the provider corpus grows.
  • Reduced vendor concentration. Using a different provider for embeddings avoids deepening dependence on OpenAI for everything. The LLM layer (Claude) is already separate from OpenAI.
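The storage claim in the dimensionality bullet is straightforward arithmetic, assuming Qdrant stores vectors as float32 (4 bytes per component):

```python
FLOAT32_BYTES = 4

def vector_bytes(dims: int, dtype_bytes: int = FLOAT32_BYTES) -> int:
    """Raw storage for one dense vector, ignoring index overhead."""
    return dims * dtype_bytes

voyage = vector_bytes(1024)    # 4096 bytes per vector
openai = vector_bytes(1536)    # 6144 bytes per vector
savings = 1 - voyage / openai  # exactly 1/3 less raw vector storage
print(voyage, openai, round(savings, 3))
```

The same ratio applies to the dot products inside similarity search, which is why lower dimensionality also speeds up queries.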

Alternatives Considered

| Alternative | Pros | Cons | Verdict |
| --- | --- | --- | --- |
| OpenAI text-embedding-3-small | Widely adopted, well documented, strong general-purpose quality | Pay-per-use costs add up; 1536 dims increase storage; less specialized for medical text | Rejected |
| Cohere embed-v3 | Good multilingual support, search-optimized | Smaller free tier than Voyage; less medical-domain benchmarking available | Deferred |
| Self-hosted BGE-M3 | No API dependency, full control, zero marginal cost | Requires GPU infrastructure; operational burden of model serving; cold-start latency | Rejected for now |

Consequences

  • Positive: Embedding costs are zero at current scale. Better retrieval quality means more relevant provider matches.
  • Positive: Smaller vector dimensions reduce Qdrant memory footprint and speed up search.
  • Negative: Voyage AI is a smaller company than OpenAI. If they change pricing or shut down, we need to re-embed the entire corpus with a new model.
  • Negative: Fewer community examples and integrations compared to OpenAI embeddings.
  • Mitigation: The embedding pipeline is abstracted behind an EmbeddingService interface, so swapping models requires changing one configuration value and running a re-indexing job.
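The mitigation above can be sketched as follows. The `EmbeddingService` name comes from this ADR; everything else (the subclass names, the registry, the fake provider) is a hypothetical shape for illustration, with the real Voyage API call omitted since it requires credentials and network access:

```python
from abc import ABC, abstractmethod

class EmbeddingService(ABC):
    """Interface the rest of the pipeline depends on; providers are
    swapped by changing one configuration value."""
    model: str
    dims: int

    @abstractmethod
    def embed(self, texts: list[str]) -> list[list[float]]: ...

class VoyageEmbeddingService(EmbeddingService):
    model, dims = "voyage-3.5-lite", 1024

    def embed(self, texts):
        # In production this would call the Voyage AI API (omitted here).
        raise NotImplementedError("requires Voyage API credentials")

class FakeEmbeddingService(EmbeddingService):
    """Deterministic stand-in for tests and local development."""
    model, dims = "fake", 4

    def embed(self, texts):
        return [[float(len(t) % 7)] * self.dims for t in texts]

# Hypothetical registry keyed by the configured model name.
REGISTRY = {"voyage-3.5-lite": VoyageEmbeddingService, "fake": FakeEmbeddingService}

def from_config(model_name: str) -> EmbeddingService:
    return REGISTRY[model_name]()
```

Swapping providers then means pointing the config at a different registry key and re-running the indexing job so every stored vector comes from the same model; mixing embedding models within one Qdrant collection would make similarity scores meaningless.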