GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 training.
☆334 · Nov 11, 2025 · Updated 6 months ago
Alternatives and similar repositories for MegaDLMs
Users who are interested in MegaDLMs are comparing it to the libraries listed below.
- The official implementation of dLLM-Var ☆34 · Nov 6, 2025 · Updated 6 months ago
- Official implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation". ☆22 · Oct 29, 2025 · Updated 7 months ago
- The official GitHub repo for "Diffusion Language Models are Super Data Learners". ☆228 · Nov 6, 2025 · Updated 6 months ago
- The official GitHub repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language models sca… ☆46 · Nov 6, 2025 · Updated 6 months ago
- Open diffusion language model for code generation, releasing pretraining, evaluation, inference, and checkpoints. ☆621 · May 11, 2026 · Updated 2 weeks ago
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" ☆64 · Jan 5, 2026 · Updated 4 months ago
- [ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series. ☆506 · Jan 28, 2026 · Updated 4 months ago
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" ☆84 · Jan 24, 2024 · Updated 2 years ago
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding" ☆1,002 · Updated this week
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models" ☆33 · Mar 26, 2026 · Updated 2 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆3,795 · Nov 12, 2025 · Updated 6 months ago
- Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas ☆114 · Feb 3, 2026 · Updated 3 months ago
- [NeurIPS 2025] Encoder-Decoder Diffusion Language Models for Efficient Training and Inference ☆42 · Oct 29, 2025 · Updated 7 months ago
- [NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models ☆131 · May 22, 2025 · Updated last year
- 🚀 PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support ☆521 · Updated this week
- Code for paper "SPG Sandwiched Policy Gradient for Masked Diffusion Language Models" ☆59 · Oct 29, 2025 · Updated 7 months ago
- Implementation of "Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models" [NeurIPS 2025] ☆81 · Dec 17, 2025 · Updated 5 months ago
- [arXiv] Discrete Diffusion in Large Language and Multimodal Models: A Survey ☆379 · Apr 4, 2026 · Updated last month
- The Dataset and Official Implementation for <Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understandi… ☆18 · Aug 7, 2024 · Updated last year
- The official GitHub repo for the survey paper "A Survey on Diffusion Language Models". ☆1,039 · May 19, 2026 · Updated last week
- ☆20 · Oct 12, 2025 · Updated 7 months ago
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ☆14 · Nov 11, 2024 · Updated last year
- ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents, NeurIPS 2025 ☆36 · Nov 15, 2025 · Updated 6 months ago
- ☆32 · Jun 5, 2025 · Updated 11 months ago
- [ACM-MM 2025 Workshop] More Is Better: A MoE-Based Emotion Recognition Framework with Human Preference Alignment. ☆25 · Nov 25, 2025 · Updated 6 months ago
- Code for the paper https://arxiv.org/abs/2205.14987v2 ☆64 · Apr 18, 2024 · Updated 2 years ago
- [AAAI-26] Are We on the Right Way for Assessing Document Retrieval-Augmented Generation? ☆30 · Dec 14, 2025 · Updated 5 months ago
- ☆17 · Dec 11, 2024 · Updated last year
- Data recipes and robust infrastructure for training AI agents ☆150 · May 22, 2026 · Updated last week
- [ICML 2025🔥] ParallelComp: Parallel Long-Context Compressor for Length Extrapolation ☆30 · Jun 16, 2025 · Updated 11 months ago
- EchoX: Towards Mitigating Acoustic-Semantic Gap via Echo Training for Speech-to-Speech LLMs ☆47 · Sep 19, 2025 · Updated 8 months ago
- [MM 2022] MM-ALT: A Multimodal Automatic Lyric Transcription System (Oral, Top paper award) ☆21 · Mar 16, 2024 · Updated 2 years ago
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction. ☆51 · Oct 18, 2024 · Updated last year
- WeDLM: The fastest diffusion language model with standard causal attention and native KV cache compatibility, delivering real speedups ov… ☆644 · Mar 3, 2026 · Updated 2 months ago
- [ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models ☆390 · May 31, 2025 · Updated 11 months ago
- Dream 7B, a large diffusion language model ☆1,240 · Nov 21, 2025 · Updated 6 months ago
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron… ☆17 · Nov 11, 2024 · Updated last year
- Official Implementation of "Maximum Likelihood Reinforcement Learning (MaxRL)" ☆183 · Updated this week
- [ICLR'26] Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs ☆99 · Jan 26, 2026 · Updated 4 months ago