reasoning

In the context of Large Language Models, reasoning is the ability to work through complex problems step by step, producing intermediate steps before arriving at a final answer.

The concept of reasoning became popular when OpenAI announced o1 on the 12th of September 2024, highlighting its capabilities in tackling complex problems in areas such as science, math, and coding. A few months later, in January 2025, DeepSeek released its R1 model, which competed with and, on some benchmarks, exceeded the performance of the proprietary o1 model. Notably, DeepSeek made the model openly available and shared a blueprint for how to train such a model.

Once a Large Language Model has completed the typical LLM training pipeline, there are three approaches to developing and improving its reasoning capabilities:

inference-time compute scaling (also called inference-compute scaling or test-time scaling)
reinforcement learning (or RL)
distillation
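
As a rough illustration of the first approach, one simple form of inference-time compute scaling is to sample several answers to the same question and keep the most frequent one (often called majority voting or self-consistency). The sketch below is only a toy under stated assumptions: `sample_answer` is a hypothetical stand-in for a real LLM call, and `make_canned_sampler` and `majority_vote` are illustrative names, not part of any library.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Iterable


def majority_vote(sample_answer: Callable[[str], str],
                  question: str,
                  n_samples: int = 5) -> str:
    """Spend more compute at inference time: sample n answers, return the most common."""
    answers = [sample_answer(question) for _ in range(n_samples)]
    # most_common(1) returns a list with one (answer, count) pair
    return Counter(answers).most_common(1)[0][0]


def make_canned_sampler(canned: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays a fixed sequence of answers."""
    stream = cycle(canned)

    def sample_answer(question: str) -> str:
        # A real sampler would call a model with temperature > 0 here.
        return next(stream)

    return sample_answer


if __name__ == "__main__":
    # Assumed toy data: the "model" is right 3 times out of 5.
    sampler = make_canned_sampler(["42", "41", "42", "43", "42"])
    print(majority_vote(sampler, "What is 6 * 7?", n_samples=5))  # prints 42
```

Increasing `n_samples` trades more inference compute for a more stable answer. It changes nothing about the model itself, which is what separates this approach from reinforcement learning and distillation.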