Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆51 · Nov 9, 2024 · Updated last year
Alternatives and similar repositories for SC-MCTS
Users that are interested in SC-MCTS are comparing it to the libraries listed below.
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆54 · Dec 13, 2025 · Updated 3 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆123 · May 6, 2025 · Updated 11 months ago
- LCA-on-the-line (ICML 2024 Oral) ☆14 · Feb 13, 2025 · Updated last year
- ☆70 · Jun 18, 2025 · Updated 9 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆66 · Oct 18, 2024 · Updated last year
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆698 · Jan 20, 2025 · Updated last year
- This repository contains a replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Allen-Zhu. ☆17 · Sep 13, 2024 · Updated last year
- [NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning" ☆31 · Mar 30, 2026 · Updated last week
- ☆17 · Oct 31, 2023 · Updated 2 years ago
- ☆341 · Jun 5, 2025 · Updated 10 months ago
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence ☆60 · Nov 11, 2025 · Updated 4 months ago
- Use the tokenizer in parallel to achieve superior acceleration ☆20 · Mar 21, 2024 · Updated 2 years ago
- ☆33 · Jun 5, 2025 · Updated 10 months ago
- Reproducing R1 for Code with Reliable Rewards ☆302 · May 5, 2025 · Updated 11 months ago
- Large Reasoning Models ☆805 · Dec 3, 2024 · Updated last year
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆55 · Nov 29, 2024 · Updated last year
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning” ☆17 · Feb 26, 2024 · Updated 2 years ago
- [NeurIPS 2024] Can Language Models Learn to Skip Steps? ☆22 · Jan 25, 2025 · Updated last year
- official repo for the paper "Learning From Mistakes Makes LLM Better Reasoner" ☆60 · Dec 20, 2023 · Updated 2 years ago
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24. ☆18 · Feb 21, 2025 · Updated last year
- [ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding ☆31 · Jan 27, 2026 · Updated 2 months ago
- ☆23 · Jul 5, 2024 · Updated last year
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆206 · Mar 4, 2025 · Updated last year
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension. ☆69 · May 31, 2024 · Updated last year
- Estimating hardware and cloud costs of LLMs and transformer projects ☆21 · Apr 1, 2026 · Updated last week
- [COLM 2025: 1st Workshop on the Application of LLM Explainability to Reasoning and Planning] Latent Chain-of-Thought? Decoding the Depth-… ☆17 · Oct 4, 2025 · Updated 6 months ago
- Recipes to train the self-rewarding reasoning LLMs. ☆231 · Mar 2, 2025 · Updated last year
- Data and codes for EMNLP 2022 paper "CDConv: A Benchmark for Contradiction Detection in Chinese Conversations" ☆13 · May 8, 2023 · Updated 2 years ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆189 · May 20, 2025 · Updated 10 months ago
- ☆28 · Feb 10, 2025 · Updated last year
- This repository collects various works that reproduce DeepSeek R1, as well as works related to DeepSeek R1 and the DeepSeek series. ☆19 · Apr 27, 2025 · Updated 11 months ago
- ☆19 · Mar 25, 2025 · Updated last year
- Official Repository of "Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Ste… ☆27 · Mar 9, 2026 · Updated last month
- ☆971 · Jan 23, 2025 · Updated last year
- [ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction ☆568 · May 6, 2025 · Updated 11 months ago
- Official Code For EMNLP2025 Findings: {DLPO : Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Le… ☆10 · Dec 25, 2025 · Updated 3 months ago
- ☆10 · Jun 11, 2025 · Updated 9 months ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆192 · Mar 20, 2025 · Updated last year
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes ☆21 · Jun 15, 2025 · Updated 9 months ago