☆41 · Jun 19, 2024 · Updated last year
Alternatives and similar repositories for llm-mcts
Users who are interested in llm-mcts are comparing it to the libraries listed below.
- A Python reimplementation/extension of "Planning with Large Language Models for Code Generation" (https://arxiv.org/abs/2303.05510) ☆18 · Dec 1, 2023 · Updated 2 years ago
- ☆21 · Jul 25, 2025 · Updated 7 months ago
- DialogueCSE: Dialogue-based Contrastive Learning of Sentence Embeddings ☆19 · Nov 24, 2021 · Updated 4 years ago
- ☆28 · May 29, 2024 · Updated last year
- This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO. ☆329 · Jan 29, 2026 · Updated last month
- Using conversational games to evaluate powerful LLMs ☆18 · Sep 3, 2023 · Updated 2 years ago
- Securade.ai Sentinel - A monitoring and surveillance application that enables visual Q&A and video captioning for existing CCTV cameras. ☆27 · Apr 6, 2025 · Updated 11 months ago
- [ACL2024 Findings] DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling ☆18 · Jun 6, 2024 · Updated last year
- Representation Learning in RL ☆13 · Jun 1, 2022 · Updated 3 years ago
- ☆158 · Mar 18, 2023 · Updated 2 years ago
- ☆20 · Mar 1, 2023 · Updated 3 years ago
- Albert for Conversational Question Answering Challenge ☆22 · Jun 12, 2023 · Updated 2 years ago
- ☆45 · Dec 12, 2024 · Updated last year
- Self-Supervised Alignment with Mutual Information ☆20 · May 24, 2024 · Updated last year
- Momentum Decoding: Open-ended Text Generation as Graph Exploration ☆19 · Jan 27, 2023 · Updated 3 years ago
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆56 · Jul 11, 2025 · Updated 7 months ago
- ☆56 · Nov 6, 2024 · Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆23 · Mar 12, 2024 · Updated last year
- We design models that generate conversational responses for factual questions using expert answer phrases from Question Answering systems… ☆21 · Jul 2, 2020 · Updated 5 years ago
- Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076. ☆25 · Jan 23, 2024 · Updated 2 years ago
- [NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling bett… ☆298 · Nov 16, 2024 · Updated last year
- ☆1,033 · Dec 17, 2024 · Updated last year
- ☆102 · Dec 7, 2023 · Updated 2 years ago
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models ☆68 · Apr 26, 2025 · Updated 10 months ago
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆190 · Mar 7, 2025 · Updated last year
- ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024) ☆692 · Jan 20, 2025 · Updated last year
- repo of "SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics" (ASE 22) https://oadoi.org/10.1145/3551349.3556926 ☆27 · Mar 4, 2024 · Updated 2 years ago
- Package to optimize Adversarial Attacks against (Large) Language Models with Varied Objectives ☆70 · Feb 22, 2024 · Updated 2 years ago
- Public Inflection Benchmarks ☆68 · Mar 6, 2024 · Updated 2 years ago
- ☆120 · Aug 28, 2024 · Updated last year
- Explore what LLMs are really learning over SFT ☆28 · Mar 30, 2024 · Updated last year
- Code for Quiet-STaR ☆741 · Aug 21, 2024 · Updated last year
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective" ☆33 · May 9, 2024 · Updated last year
- ☆29 · Dec 28, 2025 · Updated 2 months ago
- Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data) ☆28 · Dec 19, 2023 · Updated 2 years ago
- Official implementation of the transformer (TF) architecture suggested in a paper entitled "Looped Transformers as Programmable Computers… ☆35 · Apr 8, 2023 · Updated 2 years ago
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆209 · Jul 19, 2024 · Updated last year
- ☆32 · Jun 5, 2025 · Updated 9 months ago
- [ACL 2025, Main Conference, Oral] Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process ☆30 · Aug 2, 2024 · Updated last year