Base models know how to reason, thinking models learn when

Update: 2025-10-11

Description

This paper argues that thinking language models (LLMs that reason step by step) do not acquire entirely new capabilities during post-training; instead, they learn when to deploy reasoning mechanisms already latent in their base counterparts. The authors use unsupervised clustering of Sparse Autoencoder (SAE) features to derive an interpretable taxonomy of distinct reasoning behaviors, such as numeric computation and planning next steps. They then build a hybrid model in which the base model generates text while steering vectors, derived from the thinking model's activation patterns, activate specific reasoning behaviors. This hybrid recovers up to 91% of the performance gap between base and thinking models on reasoning benchmarks such as MATH500 while steering only a small fraction of tokens, supporting the view that the primary benefit of costly reasoning post-training is teaching the model when to deploy mechanisms it already has.
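As a rough illustration of the steering mechanism, the sketch below adds a fixed direction to one layer's residual stream of a small open model during generation. This is a minimal sketch under stated assumptions, not the paper's implementation: the model ("gpt2"), layer index, steering strength, and the random stand-in vector are all hypothetical; in the paper the steering vectors come from SAE features of the thinking model, and only a small fraction of tokens are steered.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# All names below are illustrative stand-ins, not the paper's setup.
MODEL_NAME = "gpt2"   # stand-in for the base model
LAYER_IDX = 6         # layer whose residual stream gets steered (assumed)
ALPHA = 4.0           # steering strength (assumed hyperparameter)

tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

# Random unit vector as a placeholder for an SAE-derived reasoning
# direction (e.g., "planning next steps") from the thinking model.
direction = torch.randn(model.config.hidden_size)
direction = direction / direction.norm()

def steer_hook(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    hidden = output[0] + ALPHA * direction.to(output[0])
    return (hidden,) + output[1:]

# The paper steers only a small fraction of tokens; for simplicity this
# hook steers every position the layer processes.
handle = model.transformer.h[LAYER_IDX].register_forward_hook(steer_hook)
try:
    prompt = "Question: what is 17 * 24? Let's think step by step."
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=40, do_sample=False)
finally:
    handle.remove()

print(tok.decode(out[0], skip_special_tokens=True))
```

A faithful reproduction would replace the random vector with an SAE-derived behavior direction and gate the addition on a per-token criterion, rather than steering every position as this sketch does.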


Enoch H. Kang