TergelMunkhbat / concise-reasoning
Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"
☆41Updated 5 months ago
Alternatives and similar repositories for concise-reasoning
Users who are interested in concise-reasoning are comparing it to the repositories listed below
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems☆106Updated 3 months ago
- ☆35Updated 4 months ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Updated 4 months ago
- ☆49Updated 7 months ago
- Process Reward Models That Think☆55Updated 3 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆121Updated last year
- ☆98Updated last month
- [EMNLP 2025 Industry] Repo for "Z1: Efficient Test-time Scaling with Code"☆64Updated 5 months ago
- ☆50Updated 4 months ago
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning☆53Updated 4 months ago
- ☆23Updated last year
- ☆40Updated 4 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆82Updated 6 months ago
- SSRL: Self-Search Reinforcement Learning☆145Updated last month
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆106Updated 2 months ago
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning☆101Updated 2 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"☆109Updated last year
- ☆23Updated 9 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling☆104Updated 4 months ago
- [NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning☆122Updated 2 weeks ago
- Verifiers for LLM Reinforcement Learning☆74Updated 5 months ago
- ☆82Updated 3 months ago
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆32Updated 3 weeks ago
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]☆173Updated 3 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆104Updated 5 months ago
- ☆20Updated 2 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆60Updated last year
- [ACL 2025] Knowledge Unlearning for Large Language Models☆43Updated 3 weeks ago
- [EMNLP 2025] The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"☆100Updated last month
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]☆174Updated 3 months ago