liujch1998/ppo-mcts

liujch1998 / ppo-mcts

☆21

Alternatives and similar repositories for ppo-mcts

Users who are interested in ppo-mcts are comparing it to the repositories listed below.

luka-group / CoIN
☆14 • Jun 11, 2024 • Updated 2 years ago
rookie-joe / AutoPSV
☆50 • Oct 28, 2024 • Updated last year
genrm-star / genrm-critiques
GenRM-CoT: Data release for verification rationales
☆68 • Oct 16, 2024 • Updated last year
dynamic-lm / interrupt-lrm
🔥 [ICML 2026] Official implementation of "Are LRMs Interruptible?"
☆18 • Jun 18, 2026 • Updated last month
jukofyork / aiassistant
An AI-powered coding assistant plugin for the Eclipse IDE.
☆14 • Oct 28, 2025 • Updated 9 months ago
zwhe99 / LLM-MT-Eval
{DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is}
☆14 • Jun 18, 2023 • Updated 3 years ago
intelligent-control-lab / APEX-MR
☆24 • Jan 23, 2026 • Updated 6 months ago
AndreaCorsini1 / SelfLabelingJobShop
Self-Labeling the Job Shop Scheduling Problem
☆23 • Jun 26, 2024 • Updated 2 years ago
Freder-chen / ReasonGenRM
A simple implementation of ReasonGenRM.
☆19 • Apr 21, 2025 • Updated last year
para-lost / ECHO
Echo: "Constantly Improving Image Models Need Constantly Improving Benchmarks" (ICLR 2026)
☆20 • Jan 29, 2026 • Updated 6 months ago
HKUST-KnowComp / IntentionQA
Code and data for the paper: IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Large Language Models …
☆12 • Apr 27, 2024 • Updated 2 years ago
123000001212 / PoisonedEye
Code of ICML 2025 paper "PoisonedEye: Knowledge Poisoning Attack on Retrieval-Augmented Generation based Large Vision-Language Models"
☆15 • Oct 30, 2025 • Updated 8 months ago
junha-l / dexter
☆20 • Jul 22, 2026 • Updated last week
visual-haystacks / mirage
🔥 [ICLR 2025] Official PyTorch Model "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
☆27 • Feb 9, 2025 • Updated last year
megagonlabs / holobench
🫧 Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.…
☆12 • Feb 25, 2025 • Updated last year
hcoxec / soft_h
soft entropy estimation
☆16 • May 29, 2026 • Updated last month
aaronserianni / attention-iou
[CVPR'25] Attention IoU: Examining Biases in CelebA using Attention Maps
☆13 • Mar 26, 2025 • Updated last year
hkust-nlp / mstar
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆75 • Jul 13, 2025 • Updated last year
promotion-kim / TMT
☆15 • Dec 10, 2025 • Updated 7 months ago
NUS-HPC-AI-Lab / InfoGrowth
Efficient and Online Dataset Growth Algorithm (with cleanness and diversity awareness) to deal with growing web data
☆20 • Aug 6, 2024 • Updated last year
VAMPIR-Lab / Interstate.jl
A lightweight driving simulator, written in Julia.
☆19 • Sep 25, 2024 • Updated last year
YuxiXie / MCTS-DPO
This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.
☆331 • Jan 29, 2026 • Updated 6 months ago
HKUNLP / DiffuSearch
[ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess"
☆39 • Mar 3, 2025 • Updated last year
whunextgen / LLMindCraft
Shaping Language Models with Cognitive Insights
☆15 • Feb 29, 2024 • Updated 2 years ago
Lylist / 2018blackfriday-salesdata-analysis
2018 Black Friday sales data analysis, with user profiling and a purchasing-power prediction report
☆15 • May 8, 2019 • Updated 7 years ago
foreverlasting1202 / QuestA
☆22 • Jan 2, 2026 • Updated 6 months ago
SII-MARFT / MARFT
☆20 • May 14, 2026 • Updated 2 months ago
lqtrung1998 / mwp_cot_design
☆14 • Oct 11, 2023 • Updated 2 years ago
zju-SWJ / RLD
Official implementation for "Knowledge Distillation with Refined Logits".
☆23 • Aug 26, 2024 • Updated last year
listen0425 / Safety-Layers
code space of paper "Safety Layers in Aligned Large Language Models: The Key to LLM Security" (ICLR 2025)
☆25 • Apr 26, 2025 • Updated last year
zcaicaros / TBGAT
Official implementation of paper "Learning Topological Representations with Bidirectional Graph Attention Network for Solving Job Shop Sc…
☆35 • Jun 13, 2025 • Updated last year
OpenMOSS / Thus-Spake-Long-Context-LLM
a survey of long-context LLMs from four perspectives, architecture, infrastructure, training, and evaluation
☆62 • Mar 31, 2025 • Updated last year
grammarly / gaqcorpus
☆13 • Nov 21, 2024 • Updated last year
Evanwu1125 / AutoWebWorld
☆25 • Jul 10, 2026 • Updated 2 weeks ago
jacobfa / Attractor
☆27 • May 20, 2026 • Updated 2 months ago
lmassaron / Gemma-2-2B-IT-GRPO
Fine-tuning the Google/gemma-2-2b-it model using Generative Reward Post-Optimization (GRPO)
☆15 • Sep 18, 2025 • Updated 10 months ago
polixir / morec
☆10 • Mar 11, 2024 • Updated 2 years ago
zwhe99 / FeedbackMT
Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model"
☆22 • Jun 28, 2024 • Updated 2 years ago
XianyiCheng / HiDex
☆13 • Jun 30, 2023 • Updated 3 years ago