Vector search means query understanding!? (daily search tip)


Good vector search means more than embeddings.

Embeddings can’t tell you whether a result actually matches the query; they only give you a similarity score. Similarity floors don’t work consistently: a cutoff that works for one query might be disastrous for another. Even worse, your embedding usually can’t capture every bit of meaning in your corpus.
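Here's a tiny sketch of the similarity floor problem. The queries, products, and similarity scores are all made up for illustration, and `apply_floor` is a hypothetical helper, not part of any vector database API.

```python
# Hypothetical cosine similarities, invented for illustration only.
scores_by_query = {
    "red running shoes": {
        "red trail runner": 0.82,
        "red sneaker": 0.78,
        "blue sandal": 0.55,
    },
    "gift for dad": {
        "leather wallet": 0.48,
        "grill tool set": 0.45,
        "baby rattle": 0.21,
    },
}


def apply_floor(scores, floor):
    """Return documents at or above the floor, best score first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [doc for doc, score in ranked if score >= floor]


FLOOR = 0.6  # one global cutoff applied to every query

for query, scores in scores_by_query.items():
    print(query, "->", apply_floor(scores, FLOOR))
```

With these invented numbers, the floor keeps the two sensible shoe results for the first query but returns nothing for the second, even though the wallet and grill set are plausible answers. Vague queries tend to score lower across the board, so one cutoff rarely fits both.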

You need to efficiently pick the best top N candidates from your vector database.

What do you need?

  • Query Understanding - Translate the query into the domain language (categories, colors, etc.) most likely to produce the best results
  • Filters - Exclude obviously irrelevant results before they are scored
  • Boosts - Promote items that are close to the information need in ways your embedding doesn’t express. Bring up the most popular item, the one that can ship now, etc.
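The three pieces above can be sketched in a few lines of plain Python. Everything here is an assumption for illustration: the toy catalog, the 2-d vectors, the keyword lookup standing in for query understanding, and the boost weights. A real system would push the filter into the vector database and tune the boosts against relevance judgments.

```python
import math
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    category: str
    color: str
    popularity: float  # 0..1, hypothetical popularity signal
    ships_now: bool
    vector: tuple  # toy 2-d embedding


# Toy catalog, invented for illustration.
CATALOG = [
    Product("crimson trail runner", "shoes", "red", 0.9, True, (0.9, 0.1)),
    Product("red road racer", "shoes", "red", 0.4, False, (0.95, 0.05)),
    Product("red rain jacket", "jackets", "red", 0.8, True, (0.85, 0.2)),
    Product("blue trail runner", "shoes", "blue", 0.7, True, (0.8, 0.3)),
]

KNOWN_COLORS = {"red", "blue"}
CATEGORY_TERMS = {"shoes": "shoes", "sneakers": "shoes", "jacket": "jackets"}


def understand_query(query):
    """Query understanding: map raw tokens to domain attributes."""
    tokens = query.lower().split()
    color = next((t for t in tokens if t in KNOWN_COLORS), None)
    category = next((CATEGORY_TERMS[t] for t in tokens if t in CATEGORY_TERMS), None)
    return {"color": color, "category": category}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def search(query, query_vector, top_n=2):
    intent = understand_query(query)

    # Filters: drop obviously irrelevant items before any scoring happens.
    candidates = [
        p for p in CATALOG
        if intent["category"] in (None, p.category)
        and intent["color"] in (None, p.color)
    ]

    # Boosts: add signals the embedding does not express (weights are guesses).
    def score(p):
        boosted = cosine(query_vector, p.vector)
        boosted += 0.1 * p.popularity
        boosted += 0.05 if p.ships_now else 0.0
        return boosted

    return sorted(candidates, key=score, reverse=True)[:top_n]


results = search("red running shoes", (1.0, 0.0))
print([p.name for p in results])
```

In this toy data the red road racer has the highest raw cosine similarity, but the popular, ready-to-ship trail runner ends up first once the boosts are applied. The jacket and the blue shoe never get scored at all.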

Vector search alone is not enough. Search requires a full suite of solutions to work.

-Doug


Doug Turnbull

I share search tips, blog articles, and free events I'm hosting about the search and retrieval industry, vector databases, information retrieval, and more.
