Large models on CPUs

Update: 2023-05-02

Description

Model sizes are exploding these days, with parameter counts running into the billions. As Mark Kurtz explains in this episode, this makes inference slow and expensive, even though 90% or more of a model's parameters often have little to no influence on its outputs.
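To make that claim concrete, here is a minimal sketch of unstructured magnitude pruning, one common way to exploit the redundancy Mark describes. The layer shape and the 90% sparsity target are illustrative assumptions, not details from the episode:

```python
import numpy as np

# Hypothetical illustration of unstructured magnitude pruning:
# zero out the 90% of weights with the smallest absolute values.
rng = np.random.default_rng(0)
weights = rng.normal(size=(512, 512))   # a stand-in dense layer (assumed shape)
sparsity = 0.90                          # fraction of weights to remove (assumed)

threshold = np.quantile(np.abs(weights), sparsity)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)

print(f"zeroed: {np.mean(pruned == 0.0):.0%} of parameters")
```

On a sparse model like this, an inference engine that skips the zeroed weights can do a fraction of the original multiply-accumulate work, which is what makes CPU inference competitive.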


Mark helps us understand the practicalities of, and the progress being made in, model optimization and CPU inference, including the growing opportunities to run LLMs and other generative AI models on commodity hardware.
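As one concrete flavor of the optimizations discussed, here is a short sketch of post-training dynamic quantization in PyTorch, a standard technique for speeding up CPU inference by storing Linear-layer weights as int8. The toy model below is our assumption for illustration, not something from the episode:

```python
import torch

# A toy MLP standing in for a much larger model (assumed sizes).
model = torch.nn.Sequential(
    torch.nn.Linear(768, 3072),
    torch.nn.ReLU(),
    torch.nn.Linear(3072, 768),
)

# Post-training dynamic quantization: Linear weights become int8,
# activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = quantized(torch.randn(1, 768))
print(out.shape)  # torch.Size([1, 768])
```

Combined with pruning, techniques like this are what let billion-parameter models run at usable speeds on ordinary CPUs.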



Changelog++ members save 1 minute on this episode because they made the ads disappear. Join today!

Sponsors:

  • Fastly: Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com

  • Fly.io: The home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.

Featuring:

  • Mark Kurtz

Show Notes:


Something missing or broken? PRs welcome!
