The Search Problem in Retail: Why Personalisation Still Fails

Daniel Nguyen 28 November 2022

Retail search has a credibility problem that the industry has been papering over for years. Ask any e-commerce team lead about their search experience and you'll get one of two answers: either they've accepted that search conversion is 15–25% below what it should be and built merchandising workarounds to compensate, or they're genuinely proud of a search stack that turns out, on closer inspection, to be a well-tuned BM25 index with a thin personalisation layer on top. The second group is usually more dangerous than the first, because they've stopped looking for the problem.

Before joining Banksia, I spent several years building the recommendation and search systems at an Australian marketplace with eight million monthly active buyers. The thing that became clear very quickly was that keyword matching — even with good synonym expansion, spell correction, and query rewriting — fails at the semantic level in ways that matter a lot for conversion. A buyer searching for "warm jacket for Melbourne winter" is expressing a multidimensional need: temperature range, style context, likely price sensitivity, probably a preference for certain materials. A keyword system resolves that query to a set of tokens and returns results ranked by textual relevance and historical click data. An ML system trained on historical purchase behaviour gets closer. But neither is doing the thing the buyer actually wants, which is to be understood as a person with a specific context, not as a bag of words.
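
To make the bag-of-words point concrete, here is a minimal sketch. The token-overlap scorer is a deliberately crude stand-in for keyword matching (it is not BM25), and the two product titles are invented for illustration.

```python
# Hypothetical stopword list and scorer, for illustration only.
STOPWORDS = {"for", "the", "a", "in"}

def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {t for t in text.lower().split() if t not in STOPWORDS}

def overlap_score(query, title):
    """Count the tokens the query and the product title share."""
    return len(tokens(query) & tokens(title))

query = "warm jacket for Melbourne winter"
titles = ["Insulated down puffer coat", "Winter jacket keyring"]

# Rank titles by shared tokens, highest first.
ranked = sorted(titles, key=lambda t: overlap_score(query, t), reverse=True)
print(ranked)
```

The keyring outranks the coat because it shares two tokens with the query, while the product the buyer probably wants shares none. Synonym expansion narrows that gap for known terms, but it still knows nothing about the buyer.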

The gap between what the research literature has shown to be possible with dense vector retrieval, cross-encoder re-ranking, and session-level contextual signals and what most retailers have actually deployed in production is enormous. These techniques have been in the published literature for three to five years, yet they are not running on the majority of mid-market retail search stacks. The reasons are partly engineering resource constraints, partly the difficulty of curating the training data for a dense retrieval index specific to a retailer's catalogue, and partly the fact that the ROI case, while real, is hard to measure cleanly against a baseline that nobody wants to admit is broken.
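
For readers who haven't seen the pattern, the retrieve-then-re-rank shape looks roughly like the sketch below. Everything in it is an assumption made for illustration: the two-dimensional vectors stand in for learned embeddings, and `toy_pair_scorer` is a lookup table standing in for a real cross-encoder, which would be a trained model scoring the query and product text together.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_vec, index, k):
    """Stage 1: cheap vector lookup over precomputed product vectors."""
    ranked = sorted(index, key=lambda pid: dot(query_vec, index[pid]), reverse=True)
    return ranked[:k]

def rerank(query, candidates, score_pair):
    """Stage 2: score each (query, product) pair jointly on the shortlist only."""
    return sorted(candidates, key=lambda pid: score_pair(query, pid), reverse=True)

# Toy product vectors (hypothetical; a real index would hold learned embeddings).
INDEX = {"puffer": [0.9, 0.1], "raincoat": [0.6, 0.4], "sandals": [0.0, 1.0]}

# Stand-in for a cross-encoder: fixed scores instead of a trained model.
PAIR_SCORES = {"puffer": 0.4, "raincoat": 0.7}

def toy_pair_scorer(query, product_id):
    return PAIR_SCORES.get(product_id, 0.0)

shortlist = retrieve([1.0, 0.0], INDEX, k=2)
final = rerank("warm jacket for Melbourne winter", shortlist, toy_pair_scorer)
print(shortlist, final)
```

The split exists because the second stage is too expensive to run over a whole catalogue: the first stage narrows the field cheaply, and the second spends its budget on a shortlist.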

What we find interesting about companies like Particular Audience, which we backed in 2023, is the approach of building personalised discovery as a layer over existing retailer infrastructure rather than asking retailers to replace their search stack. The insight is that most of the value from intelligent product discovery doesn't require replacing the underlying index — it requires a smarter orchestration layer that can intercept queries, enrich them with user context, and route results through a ranking model that understands individual buyer preference. That's a much lower integration lift for a retailer, and it means the value can be captured incrementally rather than requiring a full rip-and-replace. That product architecture decision reflects a real understanding of how change actually happens inside retail organisations.
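
A minimal sketch of that layering, under stated assumptions: this is not Particular Audience's implementation, and every name here (`discover`, `enrich`, `toy_backend`, `toy_rank`, the brand boost) is hypothetical. The point it shows is only that the existing index is called as-is and personalisation happens around it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Product:
    product_id: str
    brand: str
    index_score: float  # relevance score from the retailer's existing index

@dataclass(frozen=True)
class UserContext:
    user_id: str
    preferred_brands: frozenset = frozenset()

def enrich(query, user):
    """Attach user context to the raw query without changing the query text."""
    return {"text": query, "user": user}

def discover(query, user, search_backend, rank):
    """Intercept a query, call the existing index, then re-order its results."""
    enriched = enrich(query, user)
    candidates = search_backend(enriched["text"])  # existing stack is untouched
    return sorted(candidates, key=lambda p: rank(enriched, p), reverse=True)

def toy_backend(text):
    """Stand-in for the retailer's search index (returns fixed results)."""
    return [Product("p1", "BrandA", 0.8), Product("p2", "BrandB", 0.6)]

def toy_rank(enriched, product):
    """Stand-in for a ranking model: index score plus a brand-preference boost."""
    boost = 0.5 if product.brand in enriched["user"].preferred_brands else 0.0
    return product.index_score + boost

buyer = UserContext("u1", frozenset({"BrandB"}))
results = discover("warm jacket", buyer, toy_backend, toy_rank)
print([p.product_id for p in results])
```

Because `search_backend` and `rank` are passed in as plain callables, either can be swapped without touching the other, which is the incremental-adoption property described above.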

We're not bullish on every AI personalisation company we see. The category has attracted a lot of pitch decks that describe sophisticated ML systems which, on technical diligence, turn out to be cosine similarity on pre-trained embeddings with a thin A/B test framework wrapped around it. The differentiator we look for is a team that can explain its training data strategy, its cold-start behaviour, and its approach to catalogue sparsity: the hard problems that mature retail environments actually surface. Those conversations quickly separate the teams that have operated this class of system in production from the ones that have only read the papers.
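
For a sense of how little the thin version involves, here is a sketch of it, with the cold-start gap made visible. The two-dimensional vectors are toy stand-ins for pre-trained embeddings, and the popularity fallback is one assumed way of handling a buyer with no history, not a recommendation.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def recommend(history, embeddings, popular, k=2):
    """Rank unseen items by cosine similarity to the mean of the viewed items."""
    seen = [embeddings[i] for i in history if i in embeddings]
    if not seen:
        # Cold start: no usable history, so there is no profile vector to compare.
        return popular[:k]
    profile = mean_vector(seen)
    unseen = [i for i in embeddings if i not in history]
    return sorted(unseen, key=lambda i: cosine(profile, embeddings[i]), reverse=True)[:k]

# Toy vectors standing in for pre-trained embeddings (hypothetical catalogue).
EMBEDDINGS = {
    "parka": [0.9, 0.1],
    "fleece": [0.8, 0.2],
    "beanie": [0.7, 0.3],
    "sandals": [0.1, 0.9],
}
POPULAR = ["sandals", "parka"]

returning = recommend(["parka"], EMBEDDINGS, POPULAR)
new_buyer = recommend([], EMBEDDINGS, POPULAR)
print(returning, new_buyer)
```

The returning buyer gets sensible neighbours of the parka; the new buyer gets whatever happens to be popular. Nothing in this design learns from the retailer's own data, which is why the training data and cold-start questions are the ones worth asking.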