We propose a novel approach to scaling LLM inference for code generation by viewing the task as a black-box optimization problem over the code space. Drawing inspiration from traditional optimization, we introduce Scattered Forest Search (SFS) to encourage solution diversity while leveraging feedback from code execution tests. Our method features three key components: Scattering, which prompts the LLM to produce a diverse set of improvement directions; Foresting, which expands tree search from multiple random seed solutions; and Scouting, which shares successful or unsuccessful improvement “directions” across search branches. Experiments on standard code generation tasks—HumanEval, MBPP, APPS, CodeContests, and Leetcode—demonstrate significant improvements in pass rates compared to repeated sampling, line search, and naive tree search, while also finding correct solutions faster and with fewer model calls. Our framework highlights the importance of systematic exploration and exploitation strategies when scaling LLM inference for program synthesis.
Our approach treats code generation as a black-box optimization problem in which each generated function or program is a point in the code space. The objective is to find a solution that passes all hidden tests. We enhance tree search to balance exploration (Scattering) and exploitation (leveraging partial successes). Specifically, Scattering prompts the LLM for a diverse set of improvement directions at each step; Foresting expands search trees from multiple random seed solutions rather than a single root; and Scouting shares which improvement directions succeeded or failed across branches.
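The interplay of the three components can be illustrated with a minimal, self-contained sketch. Everything here is an assumption for illustration: integers stand in for the code space, a toy `score` function stands in for the fraction of tests a candidate passes, and the named "directions" stand in for the textual improvement suggestions an LLM would actually propose and apply. This is not the paper's implementation, only a schematic of the search loop.

```python
from typing import Callable, Dict, List, Tuple

def score(candidate: int, target: int = 42) -> float:
    """Toy verifier: stands in for the fraction of unit tests a program passes."""
    return 1.0 / (1.0 + abs(candidate - target))

# Toy "improvement directions" over integers. In SFS proper these would be
# textual edit suggestions that an LLM proposes and applies to a candidate program.
DIRECTIONS: Dict[str, Callable[[int], int]] = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "halve": lambda x: x // 2,
}

def sfs(seeds: List[int], budget: int = 1000, k: int = 3) -> Tuple[int, int]:
    """Sketch of Scattered Forest Search: returns (best candidate, calls used)."""
    weights = {name: 1.0 for name in DIRECTIONS}  # Scouting: shared across all trees
    frontier = list(seeds)                        # Foresting: one tree root per seed
    seen = set(seeds)
    best, calls = max(seeds, key=score), 0
    while frontier and calls < budget and score(best) < 1.0:
        frontier.sort(key=score)
        node = frontier.pop()  # expand the currently most promising node
        # Scattering: branch in k distinct directions, most trusted first
        for name in sorted(weights, key=weights.get, reverse=True)[:k]:
            child = DIRECTIONS[name](node)
            calls += 1  # one (mock) LLM call per attempted improvement
            if child in seen:
                continue
            seen.add(child)
            if score(child) > score(node):
                frontier.append(child)
                weights[name] += 1.0  # Scouting: broadcast the successful direction
            else:
                weights[name] *= 0.9  # ...and discount the failed one
            best = max(best, child, key=score)
    return best, calls
```

Starting from the toy seeds `[0, 100]`, the loop reaches the target well within the budget; the point is how Scattering widens each expansion, Foresting supplies multiple roots, and Scouting reweights directions globally rather than per branch.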
Altogether, SFS finds correct solutions faster and exploits verification signals (like partial test passes) more effectively than simpler inference approaches.
Higher accuracy and faster discovery. On the HumanEval, MBPP, APPS, CodeContests, and Leetcode benchmarks, SFS surpasses repeated sampling (best-of-N), line search, and naive tree search. It also scales better with the inference budget: each additional attempt raises the chance of success more sharply, so correct solutions are found with fewer total LLM calls.
Diverse solutions. By prompting for multiple directions and seeds, SFS avoids generating near-duplicate outputs. The search explores more of the code space, yet still exploits valuable partial feedback about correctness.
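One way to elicit this diversity is to ask for several explicitly distinct directions before any code is revised. The template below is hypothetical, not the paper's actual prompt; the function name `scatter_prompt` and its wording are illustrative assumptions about how such a Scattering prompt might be phrased.

```python
def scatter_prompt(problem: str, candidate: str, k: int = 3) -> str:
    # Hypothetical prompt template illustrating Scattering: request k distinct
    # improvement directions up front, so revisions are not near-duplicates.
    return (
        f"Problem:\n{problem}\n\n"
        f"Current attempt:\n{candidate}\n\n"
        f"Propose {k} distinct, non-overlapping directions for improving this "
        "attempt (e.g. a different algorithm, data structure, or edge-case fix), "
        "then pick one direction and revise the code accordingly."
    )
```

Because each branch commits to a named direction, the resulting samples cover more of the code space than independent resampling from the same prompt.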
In conclusion, Scattered Forest Search (SFS) reframes code generation as a black-box optimization problem over the code space, systematically balancing exploration of new code solutions with exploitation of partial successes. By leveraging simple but powerful techniques—Scattering, Foresting, and Scouting—SFS substantially reduces the number of calls required to find a correct solution and achieves higher success rates across diverse benchmarks.
Our results emphasize that code generation with LLMs benefits greatly from improved inference-time algorithms. By systematically guiding sampling through textual “directions,” ensuring multiple diverse initial seeds, and sharing successful strategies across branches, we show that SFS elevates the performance and versatility of LLMs in program synthesis tasks.