convergence-ai / lm2
Official repo of paper LM2
☆39 Updated 2 months ago
Alternatives and similar repositories for lm2
Users who are interested in lm2 are comparing it to the repositories listed below
- Code for "Reasoning to Learn from Latent Thoughts" ☆94 Updated last month
- ☆78 Updated 8 months ago
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples ☆85 Updated last month
- Tiny re-implementation of MDM in style of LLaDA and nano-gpt speedrun ☆49 Updated 2 months ago
- Code for Paper: Learning Adaptive Parallel Reasoning with Language Models ☆77 Updated 2 weeks ago
- ☆114 Updated 2 months ago
- ☆65 Updated 3 weeks ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆30 Updated 2 months ago
- ☆17 Updated last week
- official implementation of paper "Process Reward Model with Q-value Rankings" ☆57 Updated 3 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 8 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆45 Updated 3 weeks ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆92 Updated 3 weeks ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper ☆33 Updated last month
- official repository for “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” ☆143 Updated last week
- The official implementation of Self-Exploring Language Models (SELM) ☆64 Updated 11 months ago
- Process Reward Models That Think ☆30 Updated this week
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆81 Updated last month
- Repo for "Z1: Efficient Test-time Scaling with Code" ☆58 Updated last month
- ☆25 Updated 3 months ago
- ☆17 Updated 4 months ago
- Replicating O1 inference-time scaling laws ☆85 Updated 5 months ago
- ☆31 Updated 4 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆56 Updated 2 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆29 Updated last month
- ☆97 Updated 10 months ago
- ☆109 Updated 3 months ago
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆27 Updated 7 months ago
- ☆25 Updated last year
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆68 Updated 3 months ago