#207 - GPT 4.1, Gemini 2.5 Flash, Ironwood, Claude Max
Update: 2025-04-18
Description
Our 207th episode with a summary and discussion of last week's big AI news!
Recorded on 04/14/2025
Hosted by Andrey Kurenkov and Jeremie Harris.
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/.
Join our Discord here! https://discord.gg/nTyezGSKwP
In this episode:
- OpenAI introduces GPT-4.1 with optimized coding and instruction-following capabilities, featuring variants like GPT-4.1 Mini and Nano, and a million-token context window.
- Concerns arise as OpenAI reduces resources for safety testing, sparking internal and external criticism.
- xAI's newly launched API for Grok 3 showcases capabilities comparable to other leading models.
- Meta faces allegations of aiding China in AI development for business advantages, with potential compliance issues and public scrutiny looming.
Timestamps + Links:
- Tools & Apps
- (00:03:13) OpenAI’s new GPT-4.1 AI models focus on coding
- (00:08:12) ChatGPT will now remember your old conversations
- (00:11:16) Google’s newest Gemini AI model focuses on efficiency
- (00:14:27) Elon Musk’s AI company, xAI, launches an API for Grok 3
- (00:18:35) Canva is now in the coding and spreadsheet business
- (00:20:31) Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark
- Applications & Business
- (00:25:46) Ironwood: The first Google TPU for the age of inference
- (00:34:15) Anthropic rolls out a $200-per-month Claude subscription
- (00:37:17) OpenAI co-founder Ilya Sutskever’s Safe Superintelligence reportedly valued at $32B
- (00:40:20) Mira Murati’s AI startup gains prominent ex-OpenAI advisers
- (00:42:52) Hugging Face buys a humanoid robotics startup
- (00:44:58) Stargate developer Crusoe could spend $3.5 billion on a Texas data center. Most of it will be tax-free.
- Projects & Open Source
- (00:48:14) OpenAI Open Sources BrowseComp: A New Benchmark for Measuring the Ability for AI Agents to Browse the Web
- Research & Advancements
- (00:56:09) Sample, Don't Search: Rethinking Test-Time Alignment for Language Models
- (01:03:32) Concise Reasoning via Reinforcement Learning
- (01:09:37) Going beyond open data – increasing transparency and trust in language models with OLMoTrace
- (01:15:34) Independent evaluations of Grok-3 and Grok-3 mini on our suite of benchmarks
- Policy & Safety
- (01:17:58) OpenAI countersues Elon Musk, calls for enjoinment from ‘further unlawful and unfair action’
- (01:24:33) OpenAI slashes AI model safety testing time
- (01:27:55) Ex-OpenAI staffers file amicus brief opposing the company’s for-profit transition
- (01:32:25) Access to future AI models in OpenAI’s API may require a verified ID
- (01:34:53) Meta whistleblower claims tech giant built $18 billion business by aiding China in AI race and undermining U.S. national security