What is recursive reasoning in AI models?

Recursive reasoning means repeatedly applying the same model weights over evolving internal states at inference time, instead of relying only on a deeper one-shot feed-forward stack. In practice, the model refines latent states across multiple steps before producing an answer.

What is the main difference between HRM and TRM?

HRM uses two recurrent modules at different frequencies (high-level and low-level), while TRM simplifies this into a single tiny shared network with recursive refinement over separate latent/output states. TRM keeps the recursive core but removes much of HRM’s architectural complexity.

Why were HRM and TRM notable on ARC-AGI?

Both papers reported strong ARC-AGI performance with far fewer parameters and less training data than frontier LLMs typically use, suggesting that recursive computation can be a powerful axis for reasoning tasks that are hard to compress into one forward pass.

Is recursion the same as chain-of-thought prompting?

No. Chain-of-thought primarily recurses in token space (external text traces). HRM/TRM-style recursion primarily updates continuous latent states internally. That distinction changes efficiency, error propagation, and what the model can represent during reasoning.

Does this replace large language models?

Not yet. HRM/TRM are task-focused reasoning systems, not broad general-purpose chat models. The practical frontier is likely hybrid systems that combine large pretrained world models with stronger latent recursive reasoning modules.

Recursive Reasoning in 2026: HRM, TRM, and Why Inference-Time Recursion Matters


Most AI scaling conversations still default to one strategy: bigger models, more data, longer context. HRM and TRM added a different axis in 2025: more recursive computation at inference time without proportionally increasing parameter count.

This post summarizes the key ideas from recent HRM/TRM research and a Decoded discussion transcript, then maps those ideas to practical model-design choices.

TL;DR

Question	Short answer
What changed?	HRM/TRM showed that small models can gain strong reasoning behavior via recursive latent refinement loops.
Why was it interesting?	Reported ARC-style results were strong relative to model size and training data.
Core mechanism	Reuse the same weights repeatedly over internal states (`z`, `z_low`, or equivalent) at inference and training.
HRM key idea	Two-timescale recursion (high-level + low-level modules).
TRM key idea	Simplify to one tiny shared network and keep recursive refinement + deep supervision.
Big takeaway	Inference-time recursion is a meaningful compute axis, not just parameter scaling.

Primary sources:

HRM paper: arXiv:2506.21734
TRM paper (“Less is More”): arXiv:2510.04871
ARC benchmark context: ARC Prize

Why Recursion Re-entered the Conversation

A transformer forward pass is highly parallel and efficient for training, but many reasoning tasks are effectively multi-step algorithms. If the task is hard to compress into one pass, performance can bottleneck even when the model is large.

In the transcript, this is framed as a gap between:

Token-space iteration (chain-of-thought and tool calls)
Latent-space iteration (internal recursive state updates)

That distinction matters. Token-space traces are useful, but every intermediate step has to pass through discrete tokens, and the traces inherit artifacts of how they were supervised. Latent recursion can keep iterative computation inside a continuous state space.
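To make the contrast concrete, here is a minimal toy sketch in plain Python. It is not taken from either paper: `shared_step`, `quantize`, and the weight values are hypothetical names and numbers chosen for illustration. Both loops reuse the same update; the only difference is that the "token" loop snaps the state to a coarse grid after every step, as a stand-in for emitting discrete tokens.

```python
import math

def shared_step(z, x, w_z=0.5, w_x=1.0):
    """One latent update; the same two weights are reused on every call."""
    return [math.tanh(w_z * zi + w_x * xi) for zi, xi in zip(z, x)]

def quantize(z, levels_per_unit=2):
    """Toy stand-in for emitting discrete tokens: snap each value to a grid."""
    return [round(zi * levels_per_unit) / levels_per_unit for zi in z]

def latent_iteration(x, n_steps):
    """Iterate entirely in continuous state space."""
    z = [0.0] * len(x)
    for _ in range(n_steps):
        z = shared_step(z, x)
    return z

def token_iteration(x, n_steps):
    """Same update, but the state passes through a discrete bottleneck each step."""
    z = [0.0] * len(x)
    for _ in range(n_steps):
        z = quantize(shared_step(z, x))
    return z

if __name__ == "__main__":
    x = [1.0, -0.2]
    print("latent:", latent_iteration(x, 20))
    print("token: ", token_iteration(x, 20))
```

In this toy, the latent loop settles on a fixed point of the update, while the quantized loop can only ever land on grid values. Real token-space reasoning is far richer than rounding, so treat this as an intuition pump rather than a model of chain-of-thought.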

HRM in One Page

Hierarchical Reasoning Model (HRM) proposes two interacting recurrent modules:

A high-level module for slower abstract updates
A low-level module for faster local computation

At a high level, training repeatedly:

Initializes internal states
Runs nested recursion loops
Applies a supervised objective
Repeats refinement
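The four steps above can be sketched as a nested loop. This is a scalar toy under stated assumptions, not the HRM architecture: `low_step`, `high_step`, the coefficients, and the squared-error loss are all hypothetical stand-ins, and the loss is only computed here, never used to update weights. The point is the schedule: the low-level state updates several times for each high-level update, and states carry over between supervised segments.

```python
import math

def low_step(z_low, z_high, x):
    """Fast module: updates on every inner step, conditioned on the slow state."""
    return math.tanh(0.5 * z_low + 0.3 * z_high + x)

def high_step(z_high, z_low):
    """Slow module: updates once per block of low-level steps."""
    return math.tanh(0.7 * z_high + 0.4 * z_low)

def nested_recursion(x, z_high, z_low, n_high=2, t_low=3):
    """Run n_high slow updates, each preceded by t_low fast updates."""
    for _ in range(n_high):
        for _ in range(t_low):
            z_low = low_step(z_low, z_high, x)
        z_high = high_step(z_high, z_low)
    return z_high, z_low

def run_segments(x, target, n_segments=4):
    """Initialise states, then alternate nested recursion and a supervised loss."""
    z_high, z_low = 0.0, 0.0                                  # 1. initialise internal states
    losses = []
    for _ in range(n_segments):                               # 4. repeat refinement
        z_high, z_low = nested_recursion(x, z_high, z_low)    # 2. nested recursion loops
        losses.append((z_high - target) ** 2)                 # 3. supervised objective (toy)
    return losses

if __name__ == "__main__":
    print(run_segments(x=0.5, target=0.8))
```

A real implementation would backpropagate each segment's loss and decide how much of the recursion history to differentiate through; that choice is left out of this sketch entirely.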

Reported results in the paper include strong performance on reasoning-heavy tasks (including ARC-style settings) with a relatively small parameter budget and limited training samples.

Reference: Wang et al., 2025

What TRM Kept, What TRM Removed

Tiny Recursive Model (TRM) keeps the core recursive refinement intuition but simplifies architecture and training design.

From the paper’s framing:

Replace dual-network hierarchy with a single tiny shared network
Keep recursive latent/output refinement
Use deep supervision-style training across refinement iterations

The paper reports that this simplified setup outperformed HRM on key ARC-AGI metrics while using fewer parameters.
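A rough sketch of the simplified loop, again as a scalar toy: one function `net` with one weight set `W` serves both the latent update and the output update, and a loss is recorded after every refinement pass to mimic deep supervision. The names, weights, argument order, and squared-error loss are assumptions made for illustration; the actual model operates on sequences with a small neural network, and its training details are not reproduced here.

```python
import math

# The single shared weight set (illustrative values, not from the paper).
W = (0.6, 0.3, 0.5)

def net(a, b, c=0.0):
    """The one tiny shared network, reused for both kinds of update."""
    return math.tanh(W[0] * a + W[1] * b + W[2] * c)

def refine(x, y, z, n_latent=3):
    """One refinement pass: several latent updates, then one output update."""
    for _ in range(n_latent):
        z = net(z, y, x)   # latent state sees itself, the current answer, and the input
    y = net(y, z)          # current answer is revised from the refined latent state
    return y, z

def deep_supervision_losses(x, target, n_sup=4):
    """Record a loss after every refinement pass instead of only at the end."""
    y, z = 0.0, 0.0
    losses = []
    for _ in range(n_sup):
        y, z = refine(x, y, z)
        losses.append((y - target) ** 2)
    return losses

if __name__ == "__main__":
    print(deep_supervision_losses(x=1.0, target=0.5))
```

Compared with the two-module loop, there is only one set of weights to store and tune; the recursion structure (how many latent updates per output update, how many supervised passes) carries the rest.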

Reference: Jolicoeur-Martineau, 2025

Chain-of-Thought vs Latent Recursion

A useful way to reason about the difference:

Approach	Iteration medium	Typical failure mode
Chain-of-thought	Tokens (external text)	Verbose traces, brittle decomposition, inherited token errors
Tool-use loops	Tokens + external API calls	Bounded by tool availability and prior knowledge
HRM/TRM recursion	Continuous latent state	Training stability and optimization details become central

This does not make chain-of-thought obsolete. It reframes it as one recursion interface, not the only one.

Why ARC-Style Tasks Fit This Direction

ARC-style problems emphasize abstraction and stepwise transformation. They are often hard to solve via a single direct mapping from input to output.

Recursive latent refinement is naturally aligned with these tasks because it allows:

Iterative hypothesis updates
Intermediate state correction before final output
More compute depth without proportional parameter growth
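The last point is easy to check with back-of-the-envelope arithmetic. The sketch below assumes the common approximation of about 12 * d_model^2 weights per transformer layer (attention plus a 4x-wide MLP, ignoring embeddings, biases, and norms); the layer counts and width are illustrative and are not the configurations used in either paper.

```python
def unique_params(n_layers, d_model):
    """Approximate weight count for a stack of distinct transformer layers."""
    return n_layers * 12 * d_model ** 2

def effective_depth(n_layers, n_recursions):
    """Layers applied per forward computation when the stack is reused."""
    return n_layers * n_recursions

if __name__ == "__main__":
    d_model = 512

    # A 2-layer stack applied 16 times with tied weights.
    recursive_params = unique_params(2, d_model)
    recursive_depth = effective_depth(2, 16)

    # A conventional 32-layer stack applied once.
    stacked_params = unique_params(32, d_model)
    stacked_depth = effective_depth(32, 1)

    print(recursive_params, recursive_depth)  # 6291456 32
    print(stacked_params, stacked_depth)      # 100663296 32
```

Both setups apply 32 layers of computation, but the recursive one stores 16x fewer weights. Equal depth does not imply equal capability, since tied weights are less expressive per step; the arithmetic only shows that depth and parameter count can be decoupled.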

That is the core reason these papers attracted attention: not just scoreboards, but a different compute strategy.

Engineering Implications

If you are building reasoning systems, these papers suggest a practical design checklist:

Separate model capacity from compute depth. Capacity (parameters) and iterative depth (recursion steps) should be tuned independently.
Treat recursion loops as first-class hyperparameters. Refinement steps, supervision depth, and state-reset behavior can matter as much as width/depth.
Benchmark for algorithmic generalization, not only text fluency. Include tasks where single-pass pattern matching fails.
Expect hybrid architectures. General-purpose pretrained models plus compact recursive reasoning heads/modules is a plausible near-term direction.
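One way to act on the first two checklist items is to keep capacity and recursion settings in separate config objects and sweep them independently. The sketch below is a hypothetical layout: `CapacityConfig`, `RecursionConfig`, and every field name are invented for illustration and do not correspond to any particular codebase.

```python
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class CapacityConfig:
    """How many weights the model has."""
    n_layers: int
    d_model: int

@dataclass(frozen=True)
class RecursionConfig:
    """How the weights are reused at training and inference time."""
    refinement_steps: int              # latent updates per refinement pass
    supervision_steps: int             # supervised passes per example
    reset_state_per_example: bool = True

def sweep(capacities, recursions):
    """Full grid over capacity and recursion settings, tuned independently."""
    return list(product(capacities, recursions))

if __name__ == "__main__":
    capacities = [CapacityConfig(2, 256), CapacityConfig(4, 256)]
    recursions = [RecursionConfig(3, 4), RecursionConfig(6, 8), RecursionConfig(6, 16)]
    for cap, rec in sweep(capacities, recursions):
        print(cap, rec)
```

Keeping the two groups apart makes it harder to conflate "the bigger model did better" with "the model that was allowed more refinement steps did better" when reading results.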

Limits and Open Questions

Important caveats:

HRM/TRM are not drop-in replacements for broad conversational LLM products.
Reported gains are strongest on specific reasoning benchmarks; transfer breadth remains an open question.
Training dynamics (especially truncated backprop choices and recursion schedules) are still under active study.
Benchmark-specific optimization risk always exists; cross-domain validation is essential.

Practical Positioning in 2026

The most realistic interpretation is not “small recursive models replace frontier LLMs.”

It is: recursive inference is a complementary scaling axis. The field can continue scaling pretrained world models while adding stronger latent recursive computation where algorithmic reasoning is the bottleneck.

That matches where many labs are heading across agent systems and reasoning stacks: combine broad priors with targeted iterative computation.

Source Notes

This article is based on:

A Decoded discussion transcript
HRM primary paper: arXiv:2506.21734
TRM primary paper: arXiv:2510.04871
ARC benchmark site: arcprize.org

Paper results and benchmark standings can change with revised evaluations, replications, and new benchmark versions. Verify against the latest arXiv revisions and ARC Prize updates.
