[EMNLP 2025 Main] AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
☆88 · Jun 10, 2025 · Updated 9 months ago
Alternatives and similar repositories for AlphaOne
Users that are interested in AlphaOne are comparing it to the libraries listed below.
- When Reasoning Meets Its Laws ☆36 · Jan 2, 2026 · Updated 3 months ago
- [ICLR2025] Official code implementation of Video-UTR: Unhackable Temporal Rewarding for Scalable Video MLLMs ☆61 · Feb 27, 2025 · Updated last year
- [ICLR 2026] OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models ☆83 · Jan 21, 2026 · Updated 2 months ago
- [NeurIPS 2025] The implementation of paper "On Reasoning Strength Planning in Large Reasoning Models" ☆32 · Jul 6, 2025 · Updated 9 months ago
- ☆21 · Mar 18, 2026 · Updated 3 weeks ago
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient ☆64 · Sep 27, 2025 · Updated 6 months ago
- The official implementation of "PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning" (CVPR 2025) ☆29 · Oct 31, 2025 · Updated 5 months ago
- [NeurIPS 2025] The official repository for our paper, "Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reason… ☆154 · Sep 12, 2025 · Updated 6 months ago
- [NeurIPS 2025 Spotlight] SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation ☆230 · Jun 30, 2025 · Updated 9 months ago
- OS for fun ☆11 · May 29, 2021 · Updated 4 years ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆186 · Jul 23, 2025 · Updated 8 months ago
- ☆24 · Jun 18, 2025 · Updated 9 months ago
- [CVPR 2025] Official implementation for "Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbre… ☆58 · Jul 5, 2025 · Updated 9 months ago
- ☆56 · Jul 7, 2025 · Updated 9 months ago
- RL with Experience Replay ☆56 · Jul 27, 2025 · Updated 8 months ago
- Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models ☆46 · Sep 19, 2025 · Updated 6 months ago
- [RSS 2025] Learning Getting-Up Policies for Real-World Humanoid Robots ☆213 · Apr 14, 2025 · Updated 11 months ago
- Code for RRL (https://sites.google.com/view/abstractions4rl) ☆27 · Jan 21, 2022 · Updated 4 years ago
- CVPR 2022 paper ☆16 · Jun 9, 2022 · Updated 3 years ago
- [ECCV 2024] ShapeLLM: Universal 3D Object Understanding for Embodied Interaction ☆231 · Oct 8, 2024 · Updated last year
- ☆20 · Sep 27, 2024 · Updated last year
- Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasonin… ☆75 · Sep 8, 2025 · Updated 7 months ago
- ☆15 · Jul 31, 2025 · Updated 8 months ago
- [NeurIPS 2025] Official code implementation of Perception R1: Pioneering Perception Policy with Reinforcement Learning ☆290 · Jul 15, 2025 · Updated 8 months ago
- Developer project for getting basic API integrations working in under 5 minutes ☆11 · Jan 30, 2026 · Updated 2 months ago
- Code for “SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation (ICLR 2025)” ☆26 · Oct 23, 2025 · Updated 5 months ago
- SimKO: Simple Pass@K Policy Optimization ☆28 · Oct 24, 2025 · Updated 5 months ago
- ☆20 · Dec 2, 2024 · Updated last year
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration" ☆139 · Sep 4, 2025 · Updated 7 months ago
- fast trainer for educational purposes ☆24 · Mar 31, 2026 · Updated last week
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆161 · Jun 26, 2025 · Updated 9 months ago
- [ICLR 2025🎉] Official implementation for paper "ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy". ☆64 · Nov 3, 2025 · Updated 5 months ago
- [ICLR 2025] Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives ☆52 · Oct 19, 2025 · Updated 5 months ago
- ☆12 · Apr 18, 2025 · Updated 11 months ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity ☆22 · Aug 28, 2025 · Updated 7 months ago
- ☆12 · Oct 24, 2023 · Updated 2 years ago
- [ICLRW'26] EoRA: Fine-tuning-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation ☆33 · Mar 24, 2026 · Updated 2 weeks ago
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629 ☆22 · Oct 14, 2025 · Updated 5 months ago
- [ICLR 2026] RoboInter: A Holistic Intermediate Representation Suite Towards Robotic Manipulation ☆110 · Feb 14, 2026 · Updated last month