shunzh/mcts-for-llm

This is a pip package implementing Reinforcement Learning algorithms in non-stationary environments supported by the OpenAI Gym toolkit.

☆16

Alternatives and similar repositories for mcts-for-llm

Users that are interested in mcts-for-llm are comparing it to the libraries listed below.

google-research-datasets / QuoteSum
QuoteSum is a textual QA dataset containing Semi-Extractive Multi-source Question Answering (SEMQA) examples written by humans, based on …
☆13 · Mar 25, 2024 · Updated 2 years ago
rmshin / llm-mcts
☆40 · Jun 19, 2024 · Updated 2 years ago
sade-adrien / SteloCoder
☆16 · Dec 21, 2023 · Updated 2 years ago
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 · Feb 23, 2024 · Updated 2 years ago
TNAS-DCS / TNAS-DCS
☆13 · Aug 9, 2022 · Updated 3 years ago
shunzh / Code-AI-Tree-Search
☆118 · Jul 17, 2024 · Updated 2 years ago
csitfun / ConTRoL-dataset
Dataset for AAAI paper "Natural Language Inference in Context - Investigating Contextual Reasoning over Long Texts"
☆11 · Nov 18, 2022 · Updated 3 years ago
vinayakaraju46 / ROS-for-Apple-Silicon
☆11 · Apr 28, 2026 · Updated 3 months ago
terrierteam / pyterrier_adaptive
☆18 · Jun 16, 2026 · Updated last month
csmile-1006 / DEAS-Isaac-GR00T
DEAS + Isaac-GR00T + RoboCasa
☆20 · Nov 22, 2025 · Updated 8 months ago
zzshou / amr-data-augmentation
Code for our paper "AMR-DA: Data augmentation by abstract meaning representation" in ACL 2022
☆13 · May 17, 2022 · Updated 4 years ago
andrewimpellitteri / llm_poli_compass
A program to automate testing open source LLMs for their political compass scores
☆11 · Nov 28, 2023 · Updated 2 years ago
JunMa11 / CVPR-MedSegFMCompetition
Foundation Models for Biomedical Image Segmentation
☆19 · Jun 3, 2025 · Updated last year
hbin0701 / Pred-Sent
Official Implementation of the Paper "Let's Predict Sentence by Sentence"
☆15 · Dec 20, 2025 · Updated 7 months ago
carriex / lfqa_eval
ACL 2023 paper "A Critical Evaluation of Evaluations for Long-form Question Answering"
☆21 · Mar 22, 2024 · Updated 2 years ago
kliakhnovich / smmr
☆17 · Nov 17, 2025 · Updated 8 months ago
webis-de / ir_axioms
↕️ Intuitive axiomatic retrieval experimentation.
☆31 · Jul 21, 2026 · Updated last week
Alex-Mathai-98 / Optical-Flow-GANs
Optical Flow Prediction using GANs
☆12 · Nov 7, 2019 · Updated 6 years ago
pwnhyo / T-MAP
☆18 · Mar 25, 2026 · Updated 4 months ago
Rocketknight1 / minimal_lczero
A minimal reproduction of LCZero's training code, for ease of experimentation and benchmarking
☆14 · Mar 4, 2024 · Updated 2 years ago
CenterForPeaceAndSecurityStudies / ICBEdataset
☆17 · Jun 16, 2024 · Updated 2 years ago
NineAbyss / S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆77 · Apr 22, 2025 · Updated last year
qiangning / StructTempRel-EMNLP17
☆16 · Mar 9, 2018 · Updated 8 years ago
Helloworld10011 / Adversarial-Reasoning
A new algorithm that formulates jailbreaking as a reasoning problem.
☆26 · Jul 2, 2025 · Updated last year
sarrouti / HealthVer
☆20 · Feb 3, 2022 · Updated 4 years ago
sparkle-reasoning / sparkle
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16 · Dec 12, 2025 · Updated 7 months ago
CownowAn / DaSS
Official PyTorch implementation of "Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets" (ICLR 2023 notable top 25%)
☆26 · Mar 18, 2024 · Updated 2 years ago
lengyueit / gpt-mini
A simple reproduction of OpenAI GPT
☆20 · Nov 29, 2024 · Updated last year
budzianowski / opengvl
Open GVL
☆24 · Dec 1, 2025 · Updated 7 months ago
amazon-science / summary-reference-revision
☆19 · Apr 10, 2024 · Updated 2 years ago
sinameraji / multion-reddit
using multion to find all the commenters under a given reddit post, and DMing a message to them.
☆16 · Jul 21, 2024 · Updated 2 years ago
LLM-Integrity-Guard / JailMine
☆20 · Jul 22, 2024 · Updated 2 years ago
gipplab / pdf-benchmark
A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents
☆32 · Dec 8, 2022 · Updated 3 years ago
mengzaiqiao / awesome-natural-language-reasoning
A collection of research papers related to Natural Language Reasoning
☆10 · May 27, 2022 · Updated 4 years ago
john1226966735 / HAKT
Hierarchical Attention Network based Explainable Knowledge Tracing
☆10 · May 18, 2022 · Updated 4 years ago
technion-cs-nlp / llm-arithmetic-heuristics
☆27 · Jun 9, 2026 · Updated last month
terrierteam / pyterrier_rag
☆29 · Jul 13, 2026 · Updated 2 weeks ago
Fasko / Hand-Gesture-Recognition
A software project utilizing computer vision and machine learning techniques which can recognize 9 unique hand gestures.
☆25 · Nov 24, 2019 · Updated 6 years ago
microsoft / TraceCodegen
☆27 · Jun 12, 2023 · Updated 3 years ago