arxiv:2510.09988

Unifying Tree Search Algorithm and Reward Design for LLM Reasoning: A Survey

Published on Oct 11

Abstract

A unified framework for deliberative tree search in LLMs distinguishes between transient search guidance and durable parametric reward modeling, advancing test-time scaling and self-improvement.

AI-generated summary

Deliberative tree search is a cornerstone of modern Large Language Model (LLM) research, driving the pivot from brute-force scaling toward algorithmic efficiency. This single paradigm unifies two critical frontiers: Test-Time Scaling (TTS), which deploys on-demand computation to solve hard problems, and Self-Improvement, which uses search-generated data to durably enhance model parameters. However, this burgeoning field is fragmented and lacks a common formalism, particularly concerning the ambiguous role of the reward signal -- is it a transient heuristic or a durable learning target? This paper resolves this ambiguity by introducing a unified framework that deconstructs search algorithms into three core components: the Search Mechanism, Reward Formulation, and Transition Function. We establish a formal distinction between transient Search Guidance for TTS and durable Parametric Reward Modeling for Self-Improvement. Building on this formalism, we introduce a component-centric taxonomy, synthesize the state-of-the-art, and chart a research roadmap toward more systematic progress in creating autonomous, self-improving agents.
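
To make the three-component decomposition concrete, here is a minimal sketch, not the paper's formalism. It assumes a toy task (pick numbers from {1, 2, 3} until they sum to a target) and uses a plain best-first search; the names `tree_search`, `transition`, `reward`, and `is_terminal` are hypothetical and chosen only for illustration. The search returns both a solution (the reward used as transient guidance) and the scored states it visited (the kind of data a parametric reward model could later be trained on).

```python
import heapq
from typing import Callable, Iterable, List, Tuple

State = Tuple[int, ...]  # a partial "reasoning trace": the numbers chosen so far


def tree_search(
    root: State,
    transition: Callable[[State], Iterable[State]],  # Transition Function (assumed interface)
    reward: Callable[[State], float],                # Reward Formulation (assumed interface)
    is_terminal: Callable[[State], bool],
    budget: int = 100,
) -> Tuple[State, List[Tuple[State, float]]]:
    """Search Mechanism: best-first expansion under a node budget."""
    # heapq is a min-heap, so rewards are negated to pop the best state first.
    frontier = [(-reward(root), root)]
    visited: List[Tuple[State, float]] = []
    best, best_score = root, reward(root)
    while frontier and budget > 0:
        neg_score, state = heapq.heappop(frontier)
        score = -neg_score
        budget -= 1
        visited.append((state, score))
        if score > best_score:
            best, best_score = state, score
        if is_terminal(state):
            return state, visited
        for child in transition(state):
            heapq.heappush(frontier, (-reward(child), child))
    # Budget exhausted or frontier empty: fall back to the best state seen.
    return best, visited


TARGET = 7  # toy goal, an assumption for this example


def transition(state: State) -> Iterable[State]:
    # Extend the trace by one step, up to a depth of 4.
    if len(state) >= 4:
        return []
    return [state + (step,) for step in (1, 2, 3)]


def reward(state: State) -> float:
    # Heuristic score: closer to the target sum is better.
    return -float(abs(TARGET - sum(state)))


def is_terminal(state: State) -> bool:
    return sum(state) == TARGET


solution, trace = tree_search((), transition, reward, is_terminal)
print("solution:", solution)
print("scored states visited:", len(trace))
```

In this sketch the same `reward` function plays both roles the abstract distinguishes: during search it only orders the frontier and is then discarded, while the `(state, score)` pairs in `trace` are what would persist if one fitted a learned reward model to them. Swapping in a different search mechanism (for example, beam search or MCTS) would leave `transition` and `reward` unchanged.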
