suu990901/KlearReasoner

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

☆82

Alternatives and similar repositories for KlearReasoner

Users who are interested in KlearReasoner are comparing it to the repositories listed below.

Kwai-Klear / CE-GPPO
CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning
☆16 · Jan 23, 2026 · Updated 5 months ago
Kwai-Klear / Klear1.0
☆19 · Sep 7, 2025 · Updated 10 months ago
Kwai-Klear / RLEP
RL with Experience Replay
☆59 · Jul 27, 2025 · Updated 11 months ago
Rainier-rq / verl-if
Official implementation of the paper "Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following"
☆40 · Jan 11, 2026 · Updated 6 months ago
RUCAIBox / Passk_Training
The official repository of the paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models"
☆113 · Aug 15, 2025 · Updated 11 months ago
stepfun-ai / StepFun-Prover-Preview
Large language models designed for formal theorem proving through tool-integrated reasoning.
☆33 · Aug 13, 2025 · Updated 11 months ago
liushulinle / UloRL
An Ultra-Long Output Reinforcement Learning Approach
☆23 · Jul 31, 2025 · Updated 11 months ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated 11 months ago
AndreHe02 / rewarding-unlikely-release
☆15 · Jun 10, 2025 · Updated last year
multimodal-art-projection / TreePO
☆65 · Mar 30, 2026 · Updated 3 months ago
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆205 · Aug 12, 2025 · Updated 11 months ago
OpenIXCLab / CODA
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
☆37 · Aug 28, 2025 · Updated 10 months ago
ByteDance-BandAI / ReportBench
A comprehensive benchmark for evaluating deep research agents on academic survey tasks
☆56 · Sep 4, 2025 · Updated 10 months ago
THUDM / TreeRL
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
☆97 · Jun 16, 2025 · Updated last year
suu990901 / LLaMA-MiLe-Loss
Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models
☆68 · Feb 18, 2025 · Updated last year
wizard-III / ArcherCodeR
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆44 · Aug 6, 2025 · Updated 11 months ago
kaiwenzha / RL-Tango
[NeurIPS 2025] RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning
☆57 · Oct 23, 2025 · Updated 8 months ago
qhjqhj00 / MetaAgent
MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning
☆47 · Sep 3, 2025 · Updated 10 months ago
zzwkk / MUA-RL
MUA-RL: MULTI-TURN USER-INTERACTING AGENT REINFORCEMENT LEARNING FOR AGENTIC TOOL USE
☆65 · Nov 5, 2025 · Updated 8 months ago
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆165 · Mar 2, 2026 · Updated 4 months ago
inclusionAI / GroveMoE
☆24 · Aug 20, 2025 · Updated 11 months ago
Leey21 / A-Data-Centric-Study
☆18 · Mar 2, 2026 · Updated 4 months ago
zhyang2226 / AR-Lopti
[ICLR 2026] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs
☆46 · May 20, 2025 · Updated last year
lichengliu03 / unary-feedback
☆44 · Mar 31, 2026 · Updated 3 months ago
TsinghuaC3I / SSRL
SSRL: Self-Search Reinforcement Learning
☆210 · Aug 20, 2025 · Updated 11 months ago
RUCAIBox / JiuZhang3.0
The code and data for the paper JiuZhang3.0
☆49 · May 26, 2024 · Updated 2 years ago
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 10 months ago
shengliu66 / FractionalReason
Official GitHub repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17 · Jun 30, 2025 · Updated last year
DerrickYLJ / LessIsMore
[ICML 2026] Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning
☆34 · Sep 12, 2025 · Updated 10 months ago
hkgc-1 / GHPO
☆62 · Jul 21, 2025 · Updated last year
thu-coai / SPaR
☆47 · Jun 11, 2025 · Updated last year
kxfan2002 / SophiaVL-R1
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
☆94 · Aug 8, 2025 · Updated 11 months ago
ByteDance-Seed / WideSearch
WideSearch: Benchmarking Agentic Broad Info-Seeking
☆148 · Oct 9, 2025 · Updated 9 months ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆92 · Nov 18, 2025 · Updated 8 months ago
TsinghuaC3I / Unify-Post-Training
Towards a Unified View of Large Language Model Post-Training
☆211 · Sep 8, 2025 · Updated 10 months ago
PRIME-RL / P1
P1: Mastering Physics Olympiads with Reinforcement Learning
☆89 · Dec 29, 2025 · Updated 6 months ago
CriticBench / CriticBench
[ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning
☆31 · Mar 5, 2024 · Updated 2 years ago
LLM360 / MegaMath
[COLM 2025] An Open Math Pre-training Dataset with 370B Tokens.
☆110 · Apr 4, 2025 · Updated last year
axon-rl / gem
A Gym for Agentic LLMs
☆502 · Jan 21, 2026 · Updated 6 months ago