iamhankai / Forest-of-Thought
ICML2025: Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning
☆43 Updated 2 months ago
Alternatives and similar repositories for Forest-of-Thought
Users who are interested in Forest-of-Thought are comparing it to the repositories listed below.
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆187 Updated 4 months ago
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems ☆85 Updated 3 months ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation ☆110 Updated 2 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning ☆145 Updated 6 months ago
- Efficient Agent Training for Computer Use ☆116 Updated last month
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning. ☆112 Updated last week
- ☆155 Updated 2 months ago
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆226 Updated 2 months ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large … ☆85 Updated 6 months ago
- ☆274 Updated last month
- ☆149 Updated 6 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆228 Updated 2 months ago
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details. ☆195 Updated 2 weeks ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆133 Updated 3 months ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization" ☆79 Updated last month
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆251 Updated last month
- CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models ☆143 Updated last month
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆121 Updated 3 months ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior. ☆244 Updated 3 months ago
- ☆205 Updated 5 months ago
- Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆201 Updated last week
- ReasonFlux Series - A family of LLM post-training algorithms focusing on data selection, reinforcement learning, and inference scaling ☆454 Updated 2 weeks ago
- Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling". ☆267 Updated 5 months ago
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository ☆52 Updated last week
- ☆304 Updated last month
- ☆77 Updated 3 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆79 Updated last month
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning. ☆130 Updated this week
- [ICLR 2025] Benchmarking Agentic Workflow Generation ☆106 Updated 5 months ago
- ☆318 Updated last month