A Sober Look at Language Model Reasoning
☆94 · Nov 18, 2025 · Updated 4 months ago
Alternatives and similar repositories for sober-reasoning
Users that are interested in sober-reasoning are comparing it to the libraries listed below.
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆27 · Oct 14, 2025 · Updated 6 months ago
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni… ☆15 · Oct 31, 2025 · Updated 5 months ago
- Collaborative retina modelling across datasets and species. ☆19 · Apr 9, 2026 · Updated last week
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆37 · Jan 21, 2025 · Updated last year
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect… ☆134 · Jan 31, 2026 · Updated 2 months ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆27 · Dec 23, 2024 · Updated last year
- Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding. ☆13 · Nov 19, 2024 · Updated last year
- [EMNLP'25 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆68 · Apr 11, 2025 · Updated last year
- ☆33 · Oct 13, 2025 · Updated 6 months ago
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25 ☆90 · Jun 16, 2025 · Updated 10 months ago
- ☆35 · May 16, 2025 · Updated 11 months ago
- Official implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation". ☆20 · Oct 29, 2025 · Updated 5 months ago
- ☆49 · Mar 20, 2026 · Updated 3 weeks ago
- A series of technical reports on Slow Thinking with LLM ☆764 · Aug 13, 2025 · Updated 8 months ago
- Code for "Adaptive Self-improvement LLM Agentic System for ML Library Development" (ICML 2025) ☆15 · Jan 6, 2026 · Updated 3 months ago
- ☆25 · Jun 10, 2025 · Updated 10 months ago
- ☆14 · Aug 25, 2021 · Updated 4 years ago
- Official implementation of TBA for async LLM post-training. ☆30 · Nov 5, 2025 · Updated 5 months ago
- Explore and Control with Adversarial Surprise ☆10 · Jul 20, 2021 · Updated 4 years ago
- [Preprint] RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments ☆202 · Apr 7, 2026 · Updated last week
- Automatic evals for LLMs ☆585 · Feb 24, 2026 · Updated last month
- ☆1,126 · Jan 10, 2026 · Updated 3 months ago
- A scalable automated alignment method for large language models. Resources for "Aligning Large Language Models via Self-Steering Optimiza… ☆20 · Nov 21, 2024 · Updated last year
- [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? ☆37 · Jun 5, 2025 · Updated 10 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆250 · Sep 12, 2025 · Updated 7 months ago
- ☆78 · Jun 28, 2025 · Updated 9 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆82 · Dec 25, 2025 · Updated 3 months ago
- ☆54 · Feb 12, 2025 · Updated last year
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆436 · Mar 20, 2026 · Updated 3 weeks ago
- [SIGKDD 2024] Rethinking Fair Graph Neural Networks from Re-balancing ☆10 · Jul 15, 2024 · Updated last year
- Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025] ☆662 · Jul 29, 2025 · Updated 8 months ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆17 · Feb 9, 2026 · Updated 2 months ago
- [ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R… ☆112 · Jul 9, 2025 · Updated 9 months ago
- Understanding R1-Zero-Like Training: A Critical Perspective ☆1,241 · Aug 27, 2025 · Updated 7 months ago
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆274 · Apr 26, 2024 · Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 · Mar 31, 2025 · Updated last year
- ☆343 · May 24, 2025 · Updated 10 months ago
- [AAAI26] Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilitie… ☆10 · Feb 7, 2026 · Updated 2 months ago
- 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc. ☆647 · Jan 29, 2026 · Updated 2 months ago