JinjieNi / OpenMoE2
The official repo for "OpenMoE 2: Sparse Diffusion Language Models".
☆51 Updated 2 weeks ago
Alternatives and similar repositories for OpenMoE2
Users who are interested in OpenMoE2 are comparing it to the repositories listed below:
- Easy and Efficient dLLM Fine-Tuning ☆190 Updated 3 weeks ago
- Official Repository of Native Parallel Reasoner ☆92 Updated 3 weeks ago
- ☆126 Updated this week
- The official github repo for "Diffusion Language Models are Super Data Learners". ☆215 Updated 2 months ago
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI. ☆87 Updated 2 months ago
- Official Repo for Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics ☆69 Updated 2 weeks ago
- This is the official repository of InfiniteVL ☆68 Updated 3 weeks ago
- ☆109 Updated 3 months ago
- LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group. ☆218 Updated 3 weeks ago
- ☆191 Updated 3 weeks ago
- TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆380 Updated 3 weeks ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆202 Updated 2 months ago
- Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasonin… ☆67 Updated 4 months ago
- QeRL enables RL for 32B LLMs on a single H100 GPU. ☆469 Updated last month
- ☆84 Updated 9 months ago
- Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model. ☆96 Updated last week
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 11 months ago
- Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference ☆224 Updated 3 months ago
- Official JAX implementation of End-to-End Test-Time Training for Long Context ☆214 Updated last week
- VideoNSA: Native Sparse Attention Scales Video Understanding ☆77 Updated last month
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆118 Updated 2 months ago
- [NeurIPS 2025 Oral] Exploring Diffusion Transformer Designs via Grafting ☆69 Updated this week
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆79 Updated last year
- [Arxiv 2025] SparseD: Sparse Attention for Diffusion Language Models ☆53 Updated 3 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards ☆34 Updated 3 months ago
- ☆63 Updated 6 months ago
- Esoteric Language Models ☆108 Updated last month
- ☆35 Updated 9 months ago
- [NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models ☆128 Updated 7 months ago
- The official repo of VideoAgentTrek ☆39 Updated 2 months ago