Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆51 · Nov 9, 2024 · Updated last year
Alternatives and similar repositories for SC-MCTS
Users interested in SC-MCTS are comparing it to the libraries listed below.
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆123 · May 6, 2025 · Updated 10 months ago
- Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training ☆53 · Dec 13, 2025 · Updated 2 months ago
- LCA-on-the-line (ICML 2024 Oral) ☆13 · Feb 13, 2025 · Updated last year
- Use the tokenizer in parallel to achieve superior acceleration ☆20 · Mar 21, 2024 · Updated last year
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆692 · Jan 20, 2025 · Updated last year
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24. ☆17 · Feb 21, 2025 · Updated last year
- ☆32 · Jun 5, 2025 · Updated 9 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆66 · Oct 18, 2024 · Updated last year
- The code implementation of MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models… ☆40 · Feb 5, 2024 · Updated 2 years ago
- ☆70 · Jun 18, 2025 · Updated 8 months ago
- ☆18 · Jun 3, 2024 · Updated last year
- ☆19 · Mar 25, 2025 · Updated 11 months ago
- The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning” ☆17 · Feb 26, 2024 · Updated 2 years ago
- Resolving Knowledge Conflicts in Large Language Models, COLM 2024 ☆18 · Oct 7, 2025 · Updated 5 months ago
- ☆342 · Jun 5, 2025 · Updated 9 months ago
- ☆17 · Dec 21, 2023 · Updated 2 years ago
- This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Zhu. ☆17 · Sep 13, 2024 · Updated last year
- ☆14 · Mar 10, 2020 · Updated 6 years ago
- ☆19 · Nov 13, 2023 · Updated 2 years ago
- ☆28 · Oct 2, 2025 · Updated 5 months ago
- [EMNLP '23] Discriminator-Guided Chain-of-Thought Reasoning ☆50 · Oct 11, 2024 · Updated last year
- Estimating hardware and cloud costs of LLMs and transformer projects ☆21 · Jan 15, 2026 · Updated last month
- [NeurIPS 2024] Can Language Models Learn to Skip Steps? ☆22 · Jan 25, 2025 · Updated last year
- Reproducing R1 for Code with Reliable Rewards ☆292 · May 5, 2025 · Updated 10 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆206 · Mar 4, 2025 · Updated last year
- NexAU (AU for Agent Universe), a general-purpose agent framework for building intelligent agents with tool capabilities. ☆49 · Mar 2, 2026 · Updated last week
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence ☆58 · Nov 11, 2025 · Updated 3 months ago
- ☆21 · Jul 25, 2025 · Updated 7 months ago
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems ☆95 · Nov 13, 2025 · Updated 3 months ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration ☆19 · Jan 27, 2023 · Updated 3 years ago
- ☆19 · Nov 6, 2023 · Updated 2 years ago
- ☆20 · Oct 29, 2018 · Updated 7 years ago
- ☆23 · Jul 5, 2024 · Updated last year
- Large Reasoning Models ☆807 · Dec 3, 2024 · Updated last year
- Recipes to train the self-rewarding reasoning LLMs. ☆231 · Mar 2, 2025 · Updated last year
- Foundry is an interactive, real-time Javascript interface that allows flash teams to be assembled by anyone and tracked in real time. ☆29 · May 27, 2017 · Updated 8 years ago
- Source code for the paper "Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data" ☆20 · Feb 24, 2024 · Updated 2 years ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆55 · Nov 29, 2024 · Updated last year
- [ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction ☆568 · May 6, 2025 · Updated 10 months ago