JinjieNi / OpenMoE2
The official repo for "OpenMoE 2: Sparse Diffusion Language Models".
☆46 Updated 3 weeks ago
Alternatives and similar repositories for OpenMoE2
Users who are interested in OpenMoE2 are comparing it to the libraries listed below.
- Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasonin… ☆62 Updated 2 months ago
- ☆103 Updated 2 months ago
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 10 months ago
- Geometric-Mean Policy Optimization ☆94 Updated last week
- ☆73 Updated last week
- Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model. ☆95 Updated 3 weeks ago
- The official github repo for "Diffusion Language Models are Super Data Learners". ☆205 Updated 3 weeks ago
- TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆327 Updated last week
- Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference ☆202 Updated 2 months ago
- Holistic Evaluation of Multimodal LLMs on Spatial Intelligence ☆42 Updated this week
- [NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models ☆121 Updated 6 months ago
- VideoNSA: Native Sparse Attention Scales Video Understanding ☆61 Updated 2 weeks ago
- ☆62 Updated 4 months ago
- Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding ☆174 Updated last month
- SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention ☆142 Updated 2 weeks ago
- [NeurIPS'25] The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Tok… ☆59 Updated 3 weeks ago
- Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give… ☆185 Updated last month
- MiroTrain is an efficient and algorithm-first framework for post-training large agentic models. ☆99 Updated 3 months ago
- Easy and Efficient dLLM Fine-Tuning ☆76 Updated this week
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning ☆32 Updated 3 months ago
- The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink… ☆108 Updated 2 months ago
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ☆77 Updated 4 months ago
- ☆35 Updated 7 months ago
- ☆61 Updated 4 months ago
- We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that S… ☆212 Updated this week
- ☆32 Updated 4 months ago
- Sequential Diffusion Language Model (SDLM) enhances pre-trained autoregressive language models by adaptively determining generation lengt… ☆76 Updated last week
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning ☆329 Updated 6 months ago
- Open-Pandora: On-the-fly Control Video Generation ☆35 Updated last year
- Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios" ☆27 Updated 4 months ago