We propose a novel approach to scaling LLM inference for code generation by viewing the task as a black-box optimization problem over the code space. Drawing inspiration from traditional optimization, we introduce Scattered Forest Search (SFS) to encourage solution diversity while leveraging feedback from code execution tests. Our method features three key components: Scattering, which prompts the LLM to produce a diverse set of improvement directions; Foresting, which expands tree search from multiple random seed solutions; and Scouting, which shares successful or unsuccessful improvement “directions” across search branches. Experiments on standard code generation tasks—HumanEval, MBPP, APPS, CodeContests, and Leetcode—demonstrate significant improvements in pass rates compared to repeated sampling, line search, and naive tree search, while also finding correct solutions faster and with fewer model calls. Our framework highlights the importance of systematic exploration and exploitation strategies when scaling LLM inference for program synthesis.
Our approach treats code generation as a black-box optimization problem in which each generated function or program is a point in the code space. The objective is to find a solution that passes all hidden tests. We enhance tree search to balance exploration (Scattering) and exploitation (leveraging partial successes). Specifically, Scattering prompts the LLM for a diverse set of improvement directions at each step; Foresting expands search trees from multiple random seed solutions rather than a single root; and Scouting shares which improvement directions succeeded or failed across branches.
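The interplay of the three components can be illustrated with a minimal, self-contained sketch. Everything here is an assumption for illustration: integers stand in for the code space, a toy `score` function stands in for the fraction of tests a candidate passes, and the named "directions" stand in for the textual improvement suggestions an LLM would actually propose and apply. This is not the paper's implementation, only a schematic of the search loop.

```python
from typing import Callable, Dict, List, Tuple

def score(candidate: int, target: int = 42) -> float:
    """Toy verifier: stands in for the fraction of unit tests a program passes."""
    return 1.0 / (1.0 + abs(candidate - target))

# Toy "improvement directions" over integers. In SFS proper these would be
# textual edit suggestions that an LLM proposes and applies to a candidate program.
DIRECTIONS: Dict[str, Callable[[int], int]] = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "halve": lambda x: x // 2,
}

def sfs(seeds: List[int], budget: int = 1000, k: int = 3) -> Tuple[int, int]:
    """Sketch of Scattered Forest Search: returns (best candidate, calls used)."""
    weights = {name: 1.0 for name in DIRECTIONS}  # Scouting: shared across all trees
    frontier = list(seeds)                        # Foresting: one tree root per seed
    seen = set(seeds)
    best, calls = max(seeds, key=score), 0
    while frontier and calls < budget and score(best) < 1.0:
        frontier.sort(key=score)
        node = frontier.pop()  # expand the currently most promising node
        # Scattering: branch in k distinct directions, most trusted first
        for name in sorted(weights, key=weights.get, reverse=True)[:k]:
            child = DIRECTIONS[name](node)
            calls += 1  # one (mock) LLM call per attempted improvement
            if child in seen:
                continue
            seen.add(child)
            if score(child) > score(node):
                frontier.append(child)
                weights[name] += 1.0  # Scouting: broadcast the successful direction
            else:
                weights[name] *= 0.9  # ...and discount the failed one
            best = max(best, child, key=score)
    return best, calls
```

Starting from the toy seeds `[0, 100]`, the loop reaches the target well within the budget; the point is how Scattering widens each expansion, Foresting supplies multiple roots, and Scouting reweights directions globally rather than per branch.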
Altogether, SFS finds correct solutions faster and exploits verification signals (like partial test passes) more effectively than simpler inference approaches.
Higher accuracy and faster discovery. On the HumanEval, MBPP, APPS, CodeContests, and Leetcode benchmarks, SFS surpasses repeated sampling (best-of-N), line search, and naive tree search. It also scales better with the inference budget: each additional attempt raises the chance of success more sharply, so correct solutions are found with fewer total LLM calls.
Diverse solutions. By prompting for multiple directions and seeds, SFS avoids generating near-duplicate outputs. The search explores more of the code space, yet still exploits valuable partial feedback about correctness.
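One way to elicit this diversity is to ask for several explicitly distinct directions before any code is revised. The template below is hypothetical, not the paper's actual prompt; the function name `scatter_prompt` and its wording are illustrative assumptions about how such a Scattering prompt might be phrased.

```python
def scatter_prompt(problem: str, candidate: str, k: int = 3) -> str:
    # Hypothetical prompt template illustrating Scattering: request k distinct
    # improvement directions up front, so revisions are not near-duplicates.
    return (
        f"Problem:\n{problem}\n\n"
        f"Current attempt:\n{candidate}\n\n"
        f"Propose {k} distinct, non-overlapping directions for improving this "
        "attempt (e.g. a different algorithm, data structure, or edge-case fix), "
        "then pick one direction and revise the code accordingly."
    )
```

Because each branch commits to a named direction, the resulting samples cover more of the code space than independent resampling from the same prompt.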
In conclusion, Scattered Forest Search (SFS) reframes code generation as a black-box optimization problem over the code space, systematically balancing exploration of new code solutions with exploitation of partial successes. By leveraging simple but powerful techniques—Scattering, Foresting, and Scouting—SFS substantially reduces the number of calls required to find a correct solution and achieves higher success rates across diverse benchmarks.
Our results emphasize that code generation with LLMs benefits greatly from improved inference-time algorithms. By systematically guiding sampling through textual “directions,” ensuring multiple diverse initial seeds, and sharing successful strategies across branches, we show that SFS elevates the performance and versatility of LLMs in program synthesis tasks.