Listen Top Shows Blog

915: How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi

915: How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi

Update: 2025-08-19

Share

Description

Tech leader, investor, and Generationship cofounder Michelle Yi talks to Jon Krohn about finding ways to trust and secure AI systems, the methods that hackers use to jailbreak code, and what users can do to build their own trustworthy AI systems. Learn all about “red teaming” and how tech teams can handle other key technical terms like data poisoning, prompt stealing, jailbreaking and slop squatting.

This episode is brought to you by ⁠Trainium2, the latest AI chip from AWS⁠ and by the ⁠Dell AI Factory with NVIDIA⁠.

Additional materials: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠www.superdatascience.com/915⁠⁠⁠⁠⁠

Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

In this episode you will learn:

(03:31 ) What “trustworthy AI” means

(31:15 ) How to build trustworthy AI systems

(46:55 ) About Michelle’s “sorry bench”

(48:13 ) How LLMs help construct causal graphs

(51:45 ) About Generationship

Comments

In Channel

930: In Case You Missed It in September 2025

930: In Case You Missed It in September 2025

2025-10-1037:25

929: Dragon Hatchling: The Missing Link Between Transformers and the Brain, with Adrian Kosowski

929: Dragon Hatchling: The Missing Link Between Transformers and the Brain, with Adrian Kosowski

2025-10-0701:13:51

928: The “Lethal Trifecta”: Can AI Agents Ever Be Safe?

928: The “Lethal Trifecta”: Can AI Agents Ever Be Safe?

2025-10-0305:55

927: Automating Code Review with AI, feat. CodeRabbit’s David Loker

927: Automating Code Review with AI, feat. CodeRabbit’s David Loker

2025-09-3001:18:39

926: AI is Disrupting the Legal Industry: Are Paralegals Doomed?

926: AI is Disrupting the Legal Industry: Are Paralegals Doomed?

2025-09-2604:47

925: AI, Automation and the Future of Work, with Oxford’s Prof. Carl Benedikt Frey

925: AI, Automation and the Future of Work, with Oxford’s Prof. Carl Benedikt Frey

2025-09-2301:09:17

924: 95% of Enterprise AI Projects Fail (Per MIT Research)

924: 95% of Enterprise AI Projects Fail (Per MIT Research)

2025-09-1905:27

923: Graph Algorithms, GraphRAG and Causal Graphs, with Graph Guru Amy Hodler

923: Graph Algorithms, GraphRAG and Causal Graphs, with Graph Guru Amy Hodler

2025-09-1601:03:01

922: AI for Manufacturing and Industry, with Hugo Dozois-Caouette

922: AI for Manufacturing and Industry, with Hugo Dozois-Caouette

2025-09-1228:29

921: NPUs vs GPUs vs CPUs for Local AI Workloads, with Dell’s Ish Shah and Shirish Gupta

921: NPUs vs GPUs vs CPUs for Local AI Workloads, with Dell’s Ish Shah and Shirish Gupta

2025-09-0901:11:29

920: In Case You Missed It in August 2025

920: In Case You Missed It in August 2025

2025-09-0521:57

919: Hopes and Fears of AGI, with All-Time Bestselling ML Author Aurélien Géron

919: Hopes and Fears of AGI, with All-Time Bestselling ML Author Aurélien Géron

2025-09-0201:29:07

918: Multi-Agent Systems with CrewAI

918: Multi-Agent Systems with CrewAI

2025-08-2909:16

917: 8 Steps to Becoming an AI Engineer, with Kirill Eremenko

917: 8 Steps to Becoming an AI Engineer, with Kirill Eremenko

2025-08-2601:14:48

916: The 5 Key GPT-5 Takeaways

916: The 5 Key GPT-5 Takeaways

2025-08-2209:40

915: How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi

915: How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi

2025-08-1901:08:58

914: Data Lakes 101 (and Why They’re Key for AI Models), with Oz Katz

914: Data Lakes 101 (and Why They’re Key for AI Models), with Oz Katz

2025-08-1525:52

913: LLM Pre-Training and Post-Training 101, with Julien Launay

913: LLM Pre-Training and Post-Training 101, with Julien Launay

2025-08-1201:14:35

912: In Case You Missed It in July 2025

912: In Case You Missed It in July 2025

2025-08-0832:38

911: The Future of Python Notebooks is Here, with Marimo’s Dr. Akshay Agrawal

911: The Future of Python Notebooks is Here, with Marimo’s Dr. Akshay Agrawal

2025-08-0557:47

00:00

00:00

1.0x

915: How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi

915: How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi

Jon Krohn