
Hybrid search beat a bigger model

May 20, 2026

RAG, retrieval

When the university assistant gave a wrong answer, my first instinct was the wrong one: reach for a bigger model. It rarely helped. The model wasn’t the problem — it was answering faithfully from passages that didn’t contain the answer in the first place. Garbage in, confident garbage out.

The fix was retrieval, not generation.

Semantic search alone has a blind spot

Pure vector search is great at meaning. Ask “how do I request time off” and it finds the leave-policy page even if those exact words never appear. But it’s oddly bad at the things institutions care about most: a form number, an acronym, an exact policy code. “Form 3401-B” embeds into roughly the same place as “Form 3402-B”, and now you’re quoting the wrong form.

Keyword search has the opposite strengths: it matches exact tokens like codes and numbers reliably, but misses paraphrases. So I run both and merge them.
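To make the exact-term point concrete, here is a minimal sketch of keyword scoring by token overlap. It is an illustration only, not the assistant's actual retriever: the `tokenize` and `keyword_score` helpers, the two passages, and the query are all made up for the example. A production system would more likely use something like BM25.

```python
import re

def tokenize(text):
    # Lowercase, and keep hyphenated codes such as "3401-b" as one token.
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())

def keyword_score(query, passage):
    """Count how many distinct query tokens appear verbatim in the passage."""
    passage_tokens = set(tokenize(passage))
    return sum(1 for tok in set(tokenize(query)) if tok in passage_tokens)

# Hypothetical passages, invented for this example.
passages = {
    "a": "Submit Form 3401-B to request a leave of absence.",
    "b": "Submit Form 3402-B to appeal a parking fine.",
}
query = "Where do I send Form 3401-B?"

scores = {pid: keyword_score(query, text) for pid, text in passages.items()}
best = max(scores, key=scores.get)
```

Both passages share the word "form" with the query, but only passage "a" contains the exact code, so it scores higher. A one-character difference that is nearly invisible to an embedding is a clean miss for exact matching.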

flowchart LR
Q[Question] --> S[Semantic<br/>search]
Q --> K[Keyword<br/>search]
S --> M[Merge]
K --> M
M --> R[Rerank]
R --> T[Top passages<br/>to the model]

Two retrievers, one merged-and-reranked result set.
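There are several ways to do the merge step. One common choice is reciprocal rank fusion (RRF), which only needs each retriever's ranking, not comparable scores. The sketch below assumes RRF; it is not necessarily the merge used here. The function name, the document IDs, and `k=60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in,
    with rank starting at 1. Higher totals rank first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score, highest first; break ties by ID so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from the two retrievers.
semantic_hits = ["leave-policy", "form-3402-b", "form-3401-b"]
keyword_hits = ["form-3401-b", "forms-index"]

merged = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

In this toy run, "form-3401-b" comes out on top because it is the only document both retrievers returned, even though the semantic retriever ranked it last.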

What actually moved the needle

  • Reranking the merged set. Pulling 20 candidates from each retriever and letting a reranker pick the best 5 mattered more than any model swap.
  • Keeping exact terms exact. The keyword half rescued every question that hinged on a code, a name, or a number.
  • Measuring it. I built a small set of real questions with known-good passages, so “did retrieval get better” became a number instead of a vibe.
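The reranking and measuring steps above can be sketched in a few lines. This is a rough outline under stated assumptions, not the actual evaluation harness: `rerank`, `hit_rate`, the scoring function, and the toy data are all hypothetical. In practice `score_fn` would call a real reranker model, and `retrieve` would be the full hybrid pipeline.

```python
def rerank(query, candidates, score_fn, top_n=5):
    """Keep the top_n candidates by score_fn(query, candidate), best first."""
    ranked = sorted(candidates, key=lambda c: score_fn(query, c), reverse=True)
    return ranked[:top_n]

def hit_rate(eval_set, retrieve, k=5):
    """Fraction of questions whose top-k results include a known-good passage.

    eval_set is a list of (question, set_of_good_passage_ids) pairs;
    retrieve(question) returns passage IDs, best first.
    """
    hits = 0
    for question, good_ids in eval_set:
        top = retrieve(question)[:k]
        if any(pid in good_ids for pid in top):
            hits += 1
    return hits / len(eval_set)

# Toy reranker scores standing in for a real model (assumption).
toy_scores = {"p1": 0.2, "p2": 0.9, "p3": 0.5}
top_two = rerank("q", ["p1", "p2", "p3"], lambda q, c: toy_scores[c], top_n=2)

# Toy evaluation set and retriever (assumption).
eval_set = [("q1", {"p1"}), ("q2", {"p9"})]
fake_results = {"q1": ["p3", "p1"], "q2": ["p2", "p4"]}
rate = hit_rate(eval_set, lambda q: fake_results[q], k=5)
```

Running the same `hit_rate` before and after a retrieval change is what turns "did it get better" into a comparison of two numbers.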

The bigger model is still there if I want it. But most days, the boring combination of two retrievers and a reranker is what made people stop saying “that’s wrong.”