convergence-ai / lm2Links

Official repo of paper LM2

☆41

Alternatives and similar repositories for lm2

Users that are interested in lm2 are comparing it to the libraries listed below

Sorting:

efficientscaling / Z1
Repo for "Z1: Efficient Test-time Scaling with Code"
☆63Updated 3 months ago
LAMDASZ-ML / Self-Backtracking
☆47Updated 5 months ago
SalesforceAIResearch / LaTRO
☆118Updated 5 months ago
RobertCsordas / moeut
☆83Updated 11 months ago
mukhal / ThinkPRM
Process Reward Models That Think
☆47Updated last month
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆33Updated last week
s-sahoo / Eso-LMs
Esoteric Language Models
☆89Updated last week
shenao-zhang / SELM
The official implementation of Self-Exploring Language Models (SELM)
☆64Updated last year
OpenMOSS / Lorsa
☆24Updated last month
Parallel-Reasoning / APR
[COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models
☆117Updated 3 months ago
ScalingIntelligence / large_language_monkeys
☆101Updated 10 months ago
NathanGodey / qfilters
Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)
☆34Updated 4 months ago
Yu-Fangxu / FoR
[ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples
☆103Updated last week
SynthLabsAI / big-math
A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
☆59Updated 5 months ago
sunblaze-ucb / reasoning_ladder
☆34Updated 2 months ago
ryoungj / BoLT
Code for "Reasoning to Learn from Latent Thoughts"
☆114Updated 4 months ago
eric-ai-lab / Soft-Thinking
Official implementation of the paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space"
☆200Updated last week
lucidrains / coconut-pytorch
Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch
☆178Updated last month
Gen-Verse / CURE
Open-Source LLM Coders with Co-Evolving Reinforcement Learning
☆103Updated 2 weeks ago
THU-KEG / Agentic-Reward-Modeling
[ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
☆99Updated last month
complex-reasoning / RPG
The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508)
☆35Updated last week
hkust-nlp / PreSelect
[ICML 2025] Predictive Data Selection: The Data That Predicts Is the Data That Teaches
☆53Updated 5 months ago
spiral-rl / spiral
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
☆126Updated last week
sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆76Updated last month
katiekang1998 / reasoning_generalization
☆34Updated 6 months ago
matchten / LoRA-Models-for-SAEs
Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features"
☆12Updated 4 months ago
ZihanWang314 / CoE
Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models
☆219Updated last month
waterhorse1 / Natural-language-RL
Natural Language Reinforcement Learning
☆92Updated this week
LLM360 / Reasoning360
A repo for open research on building large reasoning models
☆84Updated this week
vsubramaniam851 / multiagent-ft
☆212Updated 5 months ago