Scattered Forest Search: Smarter Code Space Exploration with LLM Inference

¹Rensselaer Polytechnic Institute, ²Princeton University, ³NEC Laboratories America, ⁴XAi

SFS (Scattered Forest Search) Demo for Code Generation

Abstract

We propose a novel approach to scaling LLM inference for code generation by viewing the task as a black-box optimization problem over the code space. Drawing inspiration from traditional optimization, we introduce Scattered Forest Search (SFS) to encourage solution diversity while leveraging feedback from code execution tests. Our method features three key components: Scattering, which prompts the LLM to produce a diverse set of improvement directions; Foresting, which expands tree search from multiple random seed solutions; and Scouting, which shares successful or unsuccessful improvement “directions” across search branches. Experiments on standard code generation tasks—HumanEval, MBPP, APPS, CodeContests, and Leetcode—demonstrate significant improvements in pass rates compared to repeated sampling, line search, and naive tree search, while also finding correct solutions faster and with fewer model calls. Our framework highlights the importance of systematic exploration and exploitation strategies when scaling LLM inference for program synthesis.

Method

Visual depiction of the code space
A conceptual 2D illustration of code space. Each point corresponds to a potential code solution.

Our approach treats code generation as a black-box optimization problem in which each generated function or program is a point in the code space. The objective is to find a solution that passes all hidden tests. We enhance tree search to balance exploration (Scattering) with exploitation (building on partial successes). Specifically:

  • Scattering: Before branching from a parent solution, the LLM proposes multiple textual “directions” for improvement, so that child solutions differ significantly from one another and the search can escape local optima.
  • Foresting: We use multiple “seed” solutions and dynamically choose which seeds to expand next. Each seed has its own search tree, reducing the risk of refining a single flawed seed.
  • Scouting: Global insights about promising or unpromising directions are shared across branches, so that good directions are reinforced and unproductive ones are avoided.
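The three components above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: solutions are integers rather than programs, the `DIRECTIONS` table is a hypothetical stand-in for LLM-proposed textual directions and the edits they produce, `score` stands in for test feedback, and the UCB1 rule for choosing which seed tree to expand is an assumed selection rule that the text above does not specify.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

TARGET = 42  # toy stand-in for "passes all tests"

# Hypothetical stand-ins for LLM edits: each textual direction maps to a toy edit.
DIRECTIONS: Dict[str, Callable[[int], int]] = {
    "add one": lambda s: s + 1,
    "subtract one": lambda s: s - 1,
    "double": lambda s: s * 2,
    "halve": lambda s: s // 2,
}


def score(solution: int) -> float:
    """Toy feedback in (0, 1]; 1.0 plays the role of passing every test."""
    return 1.0 / (1.0 + abs(solution - TARGET))


@dataclass
class Node:
    solution: int
    score: float
    tried: Set[str] = field(default_factory=set)  # directions already expanded


def sfs(seeds: List[int], budget: int, n_directions: int = 2) -> Tuple[int, int]:
    """Return (best solution found, number of simulated LLM calls)."""
    trees = [[Node(s, score(s))] for s in seeds]   # Foresting: one tree per seed
    visits = [1] * len(trees)
    insights = {d: 0.0 for d in DIRECTIONS}        # Scouting: credit shared by all trees
    best = max((n for tree in trees for n in tree), key=lambda n: n.score)
    calls = 0
    while calls < budget and best.score < 1.0:
        total = sum(visits)
        # Choose which tree to expand with UCB1 over each tree's best score (assumed rule).
        t = max(
            range(len(trees)),
            key=lambda i: max(n.score for n in trees[i])
            + math.sqrt(2 * math.log(total) / visits[i]),
        )
        visits[t] += 1
        # Expand the best node of that tree that still has untried directions.
        open_nodes = [n for n in trees[t] if len(n.tried) < len(DIRECTIONS)]
        parent = max(open_nodes, key=lambda n: n.score)
        # Scattering: branch along several distinct directions, ranked by shared insights.
        untried = [d for d in DIRECTIONS if d not in parent.tried]
        ranked = sorted(untried, key=lambda d: insights[d], reverse=True)
        for d in ranked[:n_directions]:
            if calls >= budget:
                break
            parent.tried.add(d)
            sol = DIRECTIONS[d](parent.solution)
            child = Node(sol, score(sol))
            calls += 1
            trees[t].append(child)
            # Scouting update: reward directions that improved on the parent.
            insights[d] += child.score - parent.score
            if child.score > best.score:
                best = child
            if best.score == 1.0:
                break
    return best.solution, calls


if __name__ == "__main__":
    print(sfs([5, 40], budget=20))
```

In this sketch the shared `insights` table is what lets a direction that helped in one branch be tried first in another, while the per-seed trees keep a poor seed from consuming the whole budget.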

Altogether, SFS finds correct solutions faster and exploits verification signals (like partial test passes) more effectively than simpler inference approaches.
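To make the "partial test passes" signal concrete, the sketch below scores a candidate program by the fraction of tests it passes, which is the kind of black-box feedback a search procedure can optimize. The function name `pass_fraction` and the `(args, expected)` test format are assumptions chosen for illustration; the source does not specify how feedback is computed. The candidate is run with a bare `exec`, which is unsandboxed and only suitable for trusted toy code.

```python
from typing import List, Tuple


def pass_fraction(source: str, func_name: str,
                  tests: List[Tuple[tuple, object]]) -> float:
    """Black-box score: fraction of (args, expected) tests the candidate passes."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # unsandboxed: only for trusted toy code
        func = namespace[func_name]
    except Exception:
        return 0.0  # code that fails to load or lacks the function scores zero
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(tests) if tests else 0.0


# Illustrative tests and candidates (hypothetical, not from any benchmark).
tests = [((2,), 4), ((0,), 0), ((-3,), 9)]
buggy = "def square(x):\n    return x * 2\n"
fixed = "def square(x):\n    return x * x\n"

if __name__ == "__main__":
    print(pass_fraction(buggy, "square", tests))
    print(pass_fraction(fixed, "square", tests))
```

A candidate that passes some tests scores strictly between 0 and 1, so the search can prefer refining it over a candidate that passes none, even though neither is yet correct.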

Prior Methods Visualization
Visualization of prior inference-time methods (repeated sampling, line search, tree search).

Optimization Techniques in Code Space
A high-level illustration of the optimization-inspired approaches (Scattering, Foresting, Scouting) used in SFS.

Key Results

Higher accuracy and faster discovery. On the code benchmarks HumanEval, MBPP, APPS, CodeContests, and Leetcode, SFS achieves higher pass rates than repeated sampling (Best-of-N, BoN), line search, and naive tree search. Its pass rate also grows faster as more solution attempts are allowed, so it is more likely to find a correct solution within a given budget of LLM calls.

Diverse solutions. By prompting for multiple directions and seeds, SFS avoids generating near-duplicate outputs. The search explores more of the code space, yet still exploits valuable partial feedback about correctness.

Conclusion

Scattered Forest Search (SFS) reframes code generation as a black-box optimization problem over the code space, systematically balancing exploration of new code solutions with exploitation of partial successes. Using three simple but powerful techniques (Scattering, Foresting, and Scouting), SFS substantially reduces the number of LLM calls required to find a correct solution and achieves higher success rates across diverse benchmarks.

Our results emphasize that code generation with LLMs benefits greatly from improved inference-time algorithms. By systematically guiding sampling through textual “directions,” ensuring multiple diverse initial seeds, and sharing successful strategies across branches, we show that SFS elevates the performance and versatility of LLMs in program synthesis tasks.