☆97 · Dec 16, 2024 · Updated last year
Alternatives and similar repositories for mcts-llm
Users who are interested in mcts-llm are comparing it to the libraries listed below.
- Monte Carlo Tree Search Self-Refine (MCTSr) · ☆22 · Jul 6, 2024 · Updated last year
- ☆11 · Jul 21, 2024 · Updated last year
- ☆131 · Jun 18, 2024 · Updated last year
- Toy implementation of Strawberry · ☆33 · Sep 24, 2024 · Updated last year
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling · ☆54 · Jun 6, 2025 · Updated 10 months ago
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) · ☆702 · Jan 20, 2025 · Updated last year
- ☆1,034 · Dec 17, 2024 · Updated last year
- ☆32 · Oct 2, 2024 · Updated last year
- Resources regarding evML (edge verified machine learning) · ☆22 · Jan 4, 2025 · Updated last year
- HRED VHRED VHCR for Multi-Turn Dialogue Systems · ☆43 · Dec 16, 2019 · Updated 6 years ago
- OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models · ☆1,843 · Jan 17, 2025 · Updated last year
- pip install continualcode · ☆40 · Feb 10, 2026 · Updated 2 months ago
- Source code for "Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation" · ☆18 · Aug 31, 2019 · Updated 6 years ago
- Automated neural architecture search algorithms implemented in PyTorch and Autogluon toolkit. · ☆12 · Apr 17, 2020 · Updated 6 years ago
- ☆970 · Jan 23, 2025 · Updated last year
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024] · ☆594 · Dec 9, 2024 · Updated last year
- 💻 Terminal-Agent with Human-in-the-Loop Learning · ☆39 · Jan 16, 2026 · Updated 3 months ago
- [NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling bett… · ☆300 · Nov 16, 2024 · Updated last year
- Dataset Pinocchio for paper "Towards Understanding Factual Knowledge of Large Language Models" accepted by ICLR 2024 (Spotlight) · ☆12 · Mar 13, 2024 · Updated 2 years ago
- Official Implementation of "Probing Language Models for Pre-training Data Detection" · ☆20 · Dec 4, 2024 · Updated last year
- ☆11 · Apr 4, 2018 · Updated 8 years ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" · ☆80 · Apr 12, 2024 · Updated 2 years ago
- ☆42 · Nov 7, 2023 · Updated 2 years ago
- ☆48 · Feb 26, 2025 · Updated last year
- O1 Replication Journey · ☆1,999 · Jan 14, 2025 · Updated last year
- A library for advanced large language model reasoning · ☆2,343 · Jun 10, 2025 · Updated 10 months ago
- ☆341 · Jun 5, 2025 · Updated 10 months ago
- ☆23 · Dec 8, 2022 · Updated 3 years ago
- ☆17 · Oct 9, 2022 · Updated 3 years ago
- ☆16 · Sep 5, 2023 · Updated 2 years ago
- ☆16 · Oct 16, 2023 · Updated 2 years ago
- An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy… · ☆9,417 · Updated this week
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale · ☆269 · Jul 8, 2025 · Updated 9 months ago
- Targeted Data Generation with Large Language Models · ☆20 · Jun 25, 2024 · Updated last year
- ☆552 · Jan 2, 2025 · Updated last year
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems · ☆130 · Jun 11, 2025 · Updated 10 months ago
- ☆1,346 · Nov 21, 2024 · Updated last year
- The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learn… · ☆51 · Jan 5, 2026 · Updated 3 months ago
- ☆11 · Apr 11, 2019 · Updated 7 years ago