This week we’ll talk a bit about late interaction. But to get there, we need to think about why single-vector representations fail.

Let’s think about restaurants. Here’s an article reviewing local restaurants. I have three Italian restaurants and two Chinese ones. What’s the average of these? Russian or something!? Maybe Middle Eastern food?

If my document lists these restaurants, then that’s exactly what I’ll get in a single-vector encoding: a confusing muddle somewhere in the middle of every cuisine. The user comes along looking for “best Italian restaurant in my town” and they don’t get my document listing 3 Italian and 2 Chinese restaurants, because the cosine similarity between the query, “italian restaurant,” and this document, lost somewhere in the Middle East, has become so low. As documents grow in complexity, the problem only worsens.

This sort of failure mode happens all the time with embeddings, where for whatever reason the whole washes out the parts. In any information-heavy search, the tension between retrieving the whole and narrowing in on the particular, sometimes diverse, facts in a document becomes stronger.

And that’s why this week we’ll learn one approach: late interaction!

-Doug

PS today, 12:30PM ET is the last day to sign up for Cheat at Search with Agents: http://maven.com/softwaredoug/cheat-at-search

Events · Consulting · Training (use code search-tips)

You're subscribed to Doug Turnbull's daily search tips where I share tips, blog articles, events, and more. You can always manage your profile:
Good vector search means more than embeddings. Embeddings don’t know when a result matches or doesn’t match. Similarity floors don’t work consistently: a cutoff that works for one query might be disastrous for another. Even worse, your embedding usually can’t capture every little bit of meaning from your corpus. You need to efficiently pick the best top N candidates from your vector database. What do you need? Query understanding: translating the query to domain language (categories, colors,...
Reciprocal Rank Fusion merges one system’s search ranking with another’s (i.e., lexical + embedding search). RRF scores a document with ∑ 1/rank across each underlying system. I’ve found RRF is not enough. Here’s the typical pattern I see on teams:

- A mature lexical solution exists. It’s pretty good.
- The team wants to add untuned, embedding-based retrieval.
- They deploy a vector DB and RRF the embedding results with the mature system.
- Disaster ensues! The poor embedding results drag down the lexical...
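The fusion step above can be sketched in a few lines. This follows the common variant that adds a constant k (often 60) to the rank to damp any single system’s top hit; the result lists are hypothetical:

```python
from collections import defaultdict

def rrf(rankings, k=60):
    # Reciprocal Rank Fusion: score(doc) = sum over systems of 1 / (k + rank).
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Return docs ordered by fused score, highest first.
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["doc_a", "doc_b", "doc_c"]    # mature lexical system's ranking
embedding = ["doc_c", "doc_a", "doc_d"]  # untuned embedding retrieval

print(rrf([lexical, embedding]))
```

Note RRF only sees ranks, not scores, so a system returning confidently-bad results gets the same vote as a well-tuned one, which is exactly how the disaster above happens.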
Just sharing my post on Bayesian BM25 and other ways of normalizing BM25 scores. Enjoy! https://softwaredoug.com/blog/2026/03/06/probabilistic-bm25-utopia

Do you have any thoughts on normalizing BM25 scores?

-Doug
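For context on why normalization is tricky: the simplest approach is per-query min-max scaling, sketched below. This is just a baseline for comparison, not the Bayesian method from the linked post:

```python
def minmax_normalize(scores):
    # Per-query min-max normalization: map raw BM25 scores to [0, 1].
    # Caveat: the result is query-relative -- a 1.0 only means "best
    # for this query", not "a good match", which is why simple scaling
    # isn't a reliable cross-query relevance signal.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

print(minmax_normalize([12.3, 7.1, 2.4]))
```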