Hybrid search beat a bigger model
When the university assistant gave a wrong answer, my first instinct was the wrong one: reach for a bigger model. It rarely helped. The model wasn’t the problem — it was answering faithfully from passages that didn’t contain the answer in the first place. Garbage in, confident garbage out.
The fix was retrieval, not generation.
Semantic search alone has a blind spot
Pure vector search is great at meaning. Ask “how do I request time off” and it finds the leave-policy page even if those exact words never appear. But it’s oddly bad at the things institutions care about most: a form number, an acronym, an exact policy code. “Form 3401-B” embeds into roughly the same place as “Form 3402-B”, and now you’re quoting the wrong form.
Keyword search has the opposite strengths: it matches exact tokens like codes and acronyms, but misses paraphrases. So I run both and merge them.
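To make the exact-term point concrete, here is a minimal sketch of a keyword matcher that treats a code like "3401-B" as one indivisible token. It is an illustration, not the assistant's actual retriever: `tokenize`, `keyword_search`, and the two sample passages are made up for the example, and a real system would more likely use BM25 or a search engine's full-text index.

```python
import re

def tokenize(text: str) -> set[str]:
    # Allow internal hyphens so "3401-B" stays a single token ("3401-b").
    return set(re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower()))

def keyword_search(query: str, docs: dict[str, str]) -> list[str]:
    # Score each document by the number of query tokens it contains exactly.
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(text)), doc_id)
              for doc_id, text in docs.items()]
    # Highest overlap first; ties broken by doc id; drop zero-overlap docs.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]

# Hypothetical passages, invented for this example.
docs = {
    "leave": "Submit Form 3401-B to request a leave of absence.",
    "travel": "Submit Form 3402-B to claim travel reimbursement.",
}

hits = keyword_search("Where do I send form 3401-B?", docs)
print(hits[0])
```

The two form numbers differ by one digit, yet they share no token, so the passage with the right form ranks first. That is the property the keyword half contributes.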
flowchart LR
    Q[Question] --> S[Semantic<br/>search]
    Q --> K[Keyword<br/>search]
    S --> M[Merge]
    K --> M
    M --> R[Rerank]
    R --> T[Top passages<br/>to the model]
Two retrievers, one merged-and-reranked result set.
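The pipeline in the diagram can be sketched in a few lines. This is a sketch under assumptions, not the production code: `semantic_search`, `keyword_search`, and `rerank_score` are hypothetical callables you would supply, and the merge here is a plain de-duplicated union (the post does not say how the real merge works). The defaults of 20 candidates per retriever and 5 final passages are the numbers used in this post.

```python
from typing import Callable

Retriever = Callable[[str, int], list[str]]   # (query, limit) -> passage ids
Scorer = Callable[[str, str], float]          # (query, passage id) -> relevance

def merge(semantic: list[str], keyword: list[str]) -> list[str]:
    # De-duplicated union of both candidate lists; first occurrence wins.
    seen: set[str] = set()
    merged: list[str] = []
    for passage_id in semantic + keyword:
        if passage_id not in seen:
            seen.add(passage_id)
            merged.append(passage_id)
    return merged

def hybrid_search(query: str,
                  semantic_search: Retriever,
                  keyword_search: Retriever,
                  rerank_score: Scorer,
                  per_retriever: int = 20,
                  top_k: int = 5) -> list[str]:
    # Pull candidates from both retrievers, then let the reranker order them.
    candidates = merge(semantic_search(query, per_retriever),
                       keyword_search(query, per_retriever))
    ranked = sorted(candidates,
                    key=lambda pid: rerank_score(query, pid),
                    reverse=True)
    return ranked[:top_k]

# Stub retrievers and scores, invented so the example runs on its own.
stub_scores = {"a": 0.1, "b": 0.9, "c": 0.8, "d": 0.95}
top = hybrid_search(
    "example question",
    semantic_search=lambda q, k: ["a", "b", "c"][:k],
    keyword_search=lambda q, k: ["c", "d"][:k],
    rerank_score=lambda q, pid: stub_scores[pid],
    top_k=2,
)
print(top)
```

Because the reranker has the final say, the merge step only needs to avoid losing candidates. Here passage "d" appears only in the keyword results and still ends up on top.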
What actually moved the needle
- Reranking the merged set. Pulling 20 candidates from each retriever and letting a reranker pick the best 5 mattered more than any model swap.
- Keeping exact terms exact. The keyword half rescued every question that hinged on a code, a name, or a number.
- Measuring it. I built a small set of real questions with known-good passages, so “did retrieval get better” became a number instead of a vibe.
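For the measurement step, one simple way to turn "did retrieval get better" into a number is hit rate at k: the share of questions where at least one known-good passage shows up in the top k results. The sketch below assumes that metric (the post does not name one), and the function name, questions, and passage ids are illustrative only.

```python
from typing import Callable

def hit_rate_at_k(eval_set: list[tuple[str, set[str]]],
                  retrieve: Callable[[str], list[str]],
                  k: int = 5) -> float:
    # Fraction of questions with at least one known-good passage in the top k.
    if not eval_set:
        raise ValueError("eval_set must contain at least one question")
    hits = 0
    for question, good_ids in eval_set:
        top_k = retrieve(question)[:k]
        if any(passage_id in good_ids for passage_id in top_k):
            hits += 1
    return hits / len(eval_set)

# Hypothetical eval set: each question paired with its known-good passage ids.
eval_set = [
    ("how do I request time off", {"leave-policy"}),
    ("where do I send form 3401-B", {"form-3401-b"}),
]

def toy_retrieve(question: str) -> list[str]:
    # Stand-in retriever that always returns the same two passages.
    return ["leave-policy", "parking"]

score = hit_rate_at_k(eval_set, toy_retrieve, k=5)
print(score)
```

Run the same eval set before and after each retrieval change and compare the two numbers. With a few dozen real questions, a regression usually shows up immediately.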
The bigger model is still there if I want it. But most days, the boring combination of two retrievers and a reranker is what made people stop saying “that’s wrong.”