Vanishing Gradients
Episode 52: Why Most LLM Products Break at Retrieval (And How to Fix Them)

Updated: 2025-07-02

Description

Most LLM-powered features do not break at the model. They break at the context. So how do you retrieve the right information to get useful results, even under vague or messy user queries?



In this episode, we hear from Eric Ma, who leads data science research in the Data Science and AI group at Moderna. He shares what it takes to move beyond toy demos and ship LLM features that actually help people do their jobs.



We cover:

• How to align retrieval with user intent and why cosine similarity is not the answer

• How a dumb YAML-based system outperformed so-called smart retrieval pipelines (a rough sketch of that contrast follows this list)

• Why vague queries like “what is this all about” expose real weaknesses in most systems

• When vibe checks are enough and when formal evaluation is worth the effort

• How retrieval workflows can evolve alongside your product and user needs
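
To make that contrast concrete, here is a minimal, hypothetical sketch (Python, assuming PyYAML is available) of the pattern the second bullet hints at: route known, high-value intents through a hand-maintained YAML mapping, and fall back to embedding-based similarity search only for everything else. The intent names, trigger phrases, document paths, and vector_search hook are illustrative stand-ins, not the system discussed in the episode.

```python
import yaml

# Hand-written routing table: each intent lists trigger phrases and the
# curated documents that should go into the model's context for it.
# Everything here is a made-up example, not a real configuration.
ROUTES = yaml.safe_load("""
intents:
  overview:
    match: ["what is this all about", "getting started", "overview"]
    docs: ["docs/overview.md", "docs/faq.md"]
  reporting:
    match: ["weekly report", "status update"]
    docs: ["docs/reporting/how_to_report.md"]
""")

def retrieve(query, vector_search):
    """Return curated docs for a known intent; otherwise fall back to similarity search."""
    q = query.lower()
    for intent in ROUTES["intents"].values():
        if any(phrase in q for phrase in intent["match"]):
            return intent["docs"]
    # Fallback for the long tail: whatever embedding search the stack already
    # provides, e.g. top-k documents ranked by cosine similarity.
    return vector_search(q)
```

The appeal of a setup like this is that the highest-traffic routes stay transparent and editable by anyone on the team, while similarity search handles only the long tail of queries it is actually suited for.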



If you are building LLM-powered systems and care about how they work, not just whether they work, this one is for you.



LINKS

🎓 Learn more:

📺 Watch the video version on YouTube: YouTube link


Hosted by Hugo Bowne-Anderson