multimodal-art-projection/TreePO

☆65

Alternatives and similar repositories for TreePO

Users who are interested in TreePO are comparing it to the repositories listed below.

THUDM / TreeRL
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
☆97 · Jun 16, 2025 · Updated last year
Kwai-Klear / RLEP
RL with Experience Replay
☆59 · Jul 27, 2025 · Updated 11 months ago
xiaohangt / wd1
Official Implementation of wd1
☆32 · Sep 25, 2025 · Updated 10 months ago
TIGER-AI-Lab / Hierarchical-Reasoner
Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning [ICLR26]
☆64 · Apr 11, 2026 · Updated 3 months ago
multimodal-art-projection / IV-Bench
☆14 · Apr 23, 2025 · Updated last year
JingMog / THOR
[ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".
☆33 · Feb 26, 2026 · Updated 4 months ago
RUCAIBox / Passk_Training
The official repository of paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models"
☆113 · Aug 15, 2025 · Updated 11 months ago
Hesse73 / RLVR-Directions
Source Code for our ICLR'26 paper
☆17 · Feb 22, 2026 · Updated 5 months ago
AlphaLab-USTC / LRM-plans-CoT
[NeurIPS 2025] The implementation of paper "On Reasoning Strength Planning in Large Reasoning Models"
☆31 · Jul 6, 2025 · Updated last year
hkust-nlp / deepsearch-tts
Pushing Test-Time Scaling Limits of Deep Search with Asymmetric Verification
☆21 · Oct 8, 2025 · Updated 9 months ago
shengliu66 / FractionalReason
Official GitHub repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17 · Jun 30, 2025 · Updated last year
abdelfattah-lab / SplitReason
☆20 · Mar 18, 2026 · Updated 4 months ago
suu990901 / KlearReasoner
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
☆82 · Dec 25, 2025 · Updated 7 months ago
yangzhch6 / DARS
The official implementation of "Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration" (ICML 2026)
☆24 · Feb 4, 2026 · Updated 5 months ago
liushulinle / UloRL
An Ultra-Long Output Reinforcement Learning Approach
☆23 · Jul 31, 2025 · Updated 11 months ago
chtmp223 / suri
Suri: Multi-constraint instruction following for long-form text generation [EMNLP’24]
☆27 · Oct 3, 2025 · Updated 9 months ago
bigai-nlco / CREAM
[NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding
☆22 · Oct 10, 2024 · Updated last year
lime-RL / DCPO
DCPO: Dynamic Adaptive Clipping for RL
☆49 · Apr 1, 2026 · Updated 3 months ago
open-compass / RePro
[ICLR 2026] Rectifying LLM Thought From Lens of Optimization
☆15 · Dec 5, 2025 · Updated 7 months ago
RUCBM / DeepCritic
Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"
☆41 · Jun 24, 2025 · Updated last year
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated last year
KOR-Bench / KOR-Bench
☆19 · Nov 12, 2024 · Updated last year
hkust-nlp / RL-Verifier-Robustness
From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning.
☆24 · Oct 7, 2025 · Updated 9 months ago
horseee / CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
☆91 · Feb 14, 2025 · Updated last year
wizard-III / ArcherCodeR
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆44 · Aug 6, 2025 · Updated 11 months ago
hkgc-1 / GHPO
☆62 · Jul 21, 2025 · Updated last year
AMAP-ML / Tree-GRPO
[ICLR 2026] Tree Search for LLM Agent Reinforcement Learning
☆387 · Jan 26, 2026 · Updated 5 months ago
lzhxmu / CPPO
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025)
☆181 · Nov 4, 2025 · Updated 8 months ago
ritzz-ai / PACS
☆31 · Sep 12, 2025 · Updated 10 months ago
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 10 months ago
uservan / speculative_thinking
☆34 · Oct 13, 2025 · Updated 9 months ago
thu-coai / SPaR
☆47 · Jun 11, 2025 · Updated last year
BRZ911 / ViTCoT
[ACM MM 2025] ViTCoT: Video-Text Interleaved Chain-of-Thought for Boosting Video Understanding in Large Language Models
☆18 · Jul 15, 2025 · Updated last year
zjunlp / LightThinker
[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression
☆165 · Jun 22, 2026 · Updated last month
Zanette-Labs / efficient-reasoning
☆75 · Apr 13, 2025 · Updated last year
emmyqin / iw_sft
☆28 · Jul 18, 2025 · Updated last year
AIFrameResearch / SPO
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
☆55 · Sep 19, 2025 · Updated 10 months ago
THU-KEG / AdaptThink
☆186 · Dec 5, 2025 · Updated 7 months ago
zxiangx / LC-R1
Code for paper: Optimizing Length Compression in Large Reasoning Models
☆29 · Oct 20, 2025 · Updated 9 months ago