JinjieNi / MegaDLMs
GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 training.
☆89 Updated this week
Alternatives and similar repositories for MegaDLMs
Users who are interested in MegaDLMs are comparing it to the repositories listed below.
- [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond ☆176 Updated 4 months ago
- Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas ☆85 Updated last month
- ☆100 Updated last month
- ☆281 Updated 2 weeks ago
- Geometric-Mean Policy Optimization ☆89 Updated 3 weeks ago
- ☆108 Updated last year
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ☆192 Updated last week
- Esoteric Language Models ☆104 Updated last month
- [EMNLP'2025 Industry] Repo for "Z1: Efficient Test-time Scaling with Code" ☆66 Updated 6 months ago
- MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning ☆107 Updated last week
- repo for paper https://arxiv.org/abs/2504.13837 ☆203 Updated 4 months ago
- TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆296 Updated 2 weeks ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆78 Updated last month
- ☆103 Updated 4 months ago
- [NeurIPS 2025] Thinkless: LLM Learns When to Think ☆240 Updated last month
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆179 Updated 3 months ago
- ☆61 Updated 2 weeks ago
- The open-source code of MetaStone-S1. ☆107 Updated 3 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆164 Updated 5 months ago
- ☆335 Updated 3 months ago
- The official repository of the paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models" ☆96 Updated 2 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to Latent Reasoning. ☆264 Updated last week
- REverse-Engineered Reasoning for Open-Ended Generation ☆78 Updated last month
- The official github repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language models sca… ☆37 Updated this week
- AnchorAttention: Improved attention for LLMs long-context training ☆213 Updated 9 months ago
- The official github repo for "Diffusion Language Models are Super Data Learners". ☆145 Updated this week
- 🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal rei… ☆187 Updated 3 weeks ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆58 Updated 8 months ago
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning ☆326 Updated 5 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆35 Updated 8 months ago