hkgc-1/GHPO

☆62

Alternatives and similar repositories for GHPO

Users who are interested in GHPO are comparing it to the repositories listed below.

BaohaoLiao / SAGE
Self-Hinting Language Models Enhance Reinforcement Learning
☆27 · Mar 28, 2026 · Updated 4 months ago
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 11 months ago
Linn3a / siren
Official implementation of Selective Entropy Regularization (SIREN), proposed by paper 'Rethinking Entropy Regularization in Large Reason…
☆32 · Dec 10, 2025 · Updated 7 months ago
DanielSc4 / Dynamic-Activation-Composition
Materials for "Multi-property Steering of Large Language Models with Dynamic Activation Composition"
☆14 · Nov 22, 2024 · Updated last year
multimodal-art-projection / TreePO
☆65 · Mar 30, 2026 · Updated 4 months ago
yule-BUAA / MergeLLM
Codes for Merging Large Language Models
☆37 · Aug 7, 2024 · Updated last year
suu990901 / KlearReasoner
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
☆82 · Dec 25, 2025 · Updated 7 months ago
microsoft / SuperRL
☆15 · Sep 8, 2025 · Updated 10 months ago
liumy2010 / UFT
UFT: Unifying Supervised and Reinforcement Fine-Tuning
☆31 · Jun 30, 2025 · Updated last year
wizard-III / ArcherCodeR
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆44 · Aug 6, 2025 · Updated 11 months ago
Simplified-Reasoning / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆461 · Mar 20, 2026 · Updated 4 months ago
IcyFish332 / T3RL
☆48 · Apr 15, 2026 · Updated 3 months ago
CLR-Lab / SimKO
SimKO: Simple Pass@K Policy Optimization
☆31 · Oct 24, 2025 · Updated 9 months ago
RLHFlow / Minimal-RL
☆275 · May 14, 2025 · Updated last year
open-compass / RePro
[ICLR 2026] Rectifying LLM Thought From Lens of Optimization
☆15 · Dec 5, 2025 · Updated 7 months ago
wutaiqiang / MI
Official code for paper "Revisiting Model Interpolation for Efficient Reasoning"
☆17 · Jul 14, 2026 · Updated 2 weeks ago
shawnli / on-policy-distillation
Implementation of On-Policy Distillation (GKD) for Language Models - ICLR 2024
☆23 · Nov 24, 2025 · Updated 8 months ago
EIT-NLP / Speak-While-Watching
☆17 · Mar 1, 2026 · Updated 4 months ago
LHL3341 / MetaLadder
MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer (EMNLP 2025)
☆12 · Apr 18, 2025 · Updated last year
yuleiqin / RAIF
A Recipe for Building LLM Reasoners to Solve Complex Instructions
☆32 · Oct 9, 2025 · Updated 9 months ago
YihongDong / RL-PLUS
☆27 · Aug 31, 2025 · Updated 10 months ago
MikeWangWZHL / PAPO
Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"
☆153 · Feb 4, 2026 · Updated 5 months ago
LuckyyySTA / GOLF
☆18 · Mar 16, 2026 · Updated 4 months ago
Shenzhi-Wang / Beyond-the-80-20-Rule-RLVR
The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learn…
☆61 · Jan 5, 2026 · Updated 6 months ago
RUCAIBox / Passk_Training
The official repository of the paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models"
☆113 · Aug 15, 2025 · Updated 11 months ago
niuwz / LangTime
Official implementation for "LangTime: A Language-Guided Unified Model for Time Series Forecasting with Proximal Policy Optimization"
☆15 · Feb 9, 2026 · Updated 5 months ago
purbeshmitra / MOTIF
MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs
☆17 · Jul 6, 2025 · Updated last year
zhaoxlpku / SubgoalXL
☆26 · Aug 23, 2024 · Updated last year
yichengchen24 / DataChef
☆25 · Feb 12, 2026 · Updated 5 months ago
s-ball-10 / jailbreak_dynamics
☆25 · Jun 13, 2024 · Updated 2 years ago
WooooDyy / MathCritique
Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
☆55 · Nov 29, 2024 · Updated last year
TsinghuaC3I / Unify-Post-Training
Towards a Unified View of Large Language Model Post-Training
☆211 · Sep 8, 2025 · Updated 10 months ago
Su-my / TRAPO
The official repository for Trust-Region Adaptive Policy Optimization (TRAPO) – a novel hybrid framework designed to enhance large langua…
☆16 · Mar 2, 2026 · Updated 4 months ago
UMass-Embodied-AGI / BudgetGuidance
[ACL'26 Findings] Steering LLM Thinking with Budget Guidance
☆33 · Feb 19, 2026 · Updated 5 months ago
HITsz-TMG / ICL-State-Vector
☆12 · Jul 4, 2024 · Updated 2 years ago
heliossun / LaCoT
[NeurIPS 2025] Official code for paper: Latent Chain-of-Thought for Visual Reasoning
☆36 · Oct 16, 2025 · Updated 9 months ago
michaelchen-lab / caft-llm
Improving large language models with concept-aware fine-tuning (CAFT)
☆29 · Jan 31, 2026 · Updated 5 months ago
jefferyZhan / GThinker
[CVPR 2026] GThinker, Reasoning MLLM, Visual Cues, Visual Rethinking
☆18 · Mar 9, 2026 · Updated 4 months ago
yefd / RRAG
The official GitHub repository for paper "R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation" (EMNLP 2024 Fin…
☆41 · Dec 6, 2024 · Updated last year