Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
☆81 Dec 25, 2025 Updated 2 months ago
Alternatives and similar repositories for KlearReasoner
Users that are interested in KlearReasoner are comparing it to the libraries listed below.
- Vocabulary Parallelism ☆25 Mar 10, 2025 Updated last year
- MUA-RL: MULTI-TURN USER-INTERACTING AGENT REINFORCEMENT LEARNING FOR AGENTIC TOOL USE ☆58 Nov 5, 2025 Updated 4 months ago
- ☆19 Jun 14, 2024 Updated last year
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 Aug 28, 2025 Updated 6 months ago
- [ICLR24] Better Neural PDE Solvers Through Data-Free Mesh Movers ☆17 Mar 20, 2024 Updated 2 years ago
- ☆15 Dec 20, 2024 Updated last year
- RL with Experience Replay ☆55 Jul 27, 2025 Updated 7 months ago
- ☆16 Sep 4, 2025 Updated 6 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆186 Jul 23, 2025 Updated 8 months ago
- From Accuracy to Robustness: A Study of Rule- and Model-based Verifiers in Mathematical Reasoning. ☆25 Oct 7, 2025 Updated 5 months ago
- ☆14 Apr 16, 2024 Updated last year
- ☆31 Sep 12, 2025 Updated 6 months ago
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment ☆16 Dec 19, 2024 Updated last year
- This repository contains the code for the paper “Neuro-Symbolic Query Compiler”, accepted to the Findings of ACL 2025. ☆16 Oct 20, 2025 Updated 5 months ago
- A comprehensive benchmark for evaluating deep research agents on academic survey tasks ☆51 Sep 4, 2025 Updated 6 months ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆266 Aug 12, 2025 Updated 7 months ago
- ☆15 Nov 7, 2024 Updated last year
- Reproducible and flexible LLM evaluations for scientific reasoning. ☆26 Jul 23, 2025 Updated 8 months ago
- Encoder-decoders for translating different chemical formats. ☆19 Sep 17, 2025 Updated 6 months ago
- [COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs ☆64 Mar 9, 2026 Updated 2 weeks ago
- Official code repository of Shuffle-R1 ☆25 Feb 23, 2026 Updated last month
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning ☆115 Feb 2, 2026 Updated last month
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆222 Nov 27, 2025 Updated 3 months ago
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL). ☆48 Oct 16, 2025 Updated 5 months ago
- A Sober Look at Language Model Reasoning ☆94 Nov 18, 2025 Updated 4 months ago
- ☆33 Oct 13, 2025 Updated 5 months ago
- [ACL 2024 Findings] The official repo for "ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large … ☆24 May 29, 2024 Updated last year
- RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs. ☆43 Oct 31, 2025 Updated 4 months ago
- ☆13 May 23, 2025 Updated 10 months ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆17 Feb 9, 2026 Updated last month
- [ICLR 2026] BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs ☆17 May 21, 2025 Updated 10 months ago
- ✨✨ [ICLR 2026] R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning ☆282 May 9, 2025 Updated 10 months ago
- ☆25 Aug 19, 2025 Updated 7 months ago
- [ICLR2026] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models" ☆30 Feb 4, 2026 Updated last month
- Kinetics: Rethinking Test-Time Scaling Laws ☆86 Jul 11, 2025 Updated 8 months ago
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza… ☆20 Nov 21, 2024 Updated last year
- ☆35 Mar 12, 2025 Updated last year
- [ICLR 2026] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs ☆41 May 20, 2025 Updated 10 months ago
- [ML4H'25] m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning in Large Language Models ☆48 Dec 21, 2025 Updated 3 months ago