bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆92Nov 18, 2025Updated 2 months ago
Alternatives and similar repositories for sober-reasoning
Users who are interested in sober-reasoning are comparing it to the libraries listed below.
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"☆26Oct 14, 2025Updated 4 months ago
- Collaborative retina modelling across datasets and species.☆17Feb 5, 2026Updated last week
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding.☆13Nov 19, 2024Updated last year
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 3 months ago
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"☆37Jan 21, 2025Updated last year
- ☆52Feb 12, 2025Updated last year
- ☆25Jun 10, 2025Updated 8 months ago
- ☆35May 16, 2025Updated 8 months ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning".☆20Feb 26, 2025Updated 11 months ago
- ☆32Oct 13, 2025Updated 4 months ago
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza…☆20Nov 21, 2024Updated last year
- ☆17Aug 1, 2025Updated 6 months ago
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect…☆134Jan 31, 2026Updated last week
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code"☆68Apr 11, 2025Updated 10 months ago
- ☆14Apr 14, 2025Updated 9 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization☆81Dec 25, 2025Updated last month
- ☆49Aug 14, 2025Updated 6 months ago
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆72Feb 25, 2025Updated 11 months ago
- ☆75Jun 28, 2025Updated 7 months ago
- The rule-based evaluation subset and code implementation of Omni-MATH☆26Dec 23, 2024Updated last year
- ☆16Sep 4, 2025Updated 5 months ago
- LaunchPad is a lightweight Slurm job launcher designed for hyper-parameter search.☆11Aug 2, 2024Updated last year
- Official implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation".☆20Oct 29, 2025Updated 3 months ago
- Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025)☆15Jan 6, 2026Updated last month
- ☆10Apr 23, 2025Updated 9 months ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆15Updated this week
- ☆1,088Jan 10, 2026Updated last month
- Automatic evals for LLMs☆579Dec 23, 2025Updated last month
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆413Oct 4, 2025Updated 4 months ago
- ☆813Jun 9, 2025Updated 8 months ago
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion (ACL 2025)☆35Jul 16, 2025Updated 6 months ago
- ☆13Jan 22, 2025Updated last year
- Explore and Control with Adversarial Surprise☆10Jul 20, 2021Updated 4 years ago
- This is an implementation of the paper "Are We Done with Object-Centric Learning?"☆12Sep 11, 2025Updated 5 months ago
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi…☆14Jun 6, 2025Updated 8 months ago
- ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL (ICLR 2025 PyTorch Code)☆17May 15, 2025Updated 8 months ago
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25☆88Jun 16, 2025Updated 7 months ago
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning☆283Sep 25, 2025Updated 4 months ago
- ☆352Jul 29, 2025Updated 6 months ago