Gen-Verse / MMaDA
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
☆1,109 Updated last week
Alternatives and similar repositories for MMaDA
Users who are interested in MMaDA are comparing it to the repositories listed below.
- Dream 7B, a large diffusion language model ☆764 Updated last week
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models ☆698 Updated 2 months ago
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat… ☆1,247 Updated last week
- An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL ☆752 Updated this week
- Explore the Multimodal “Aha Moment” on 2B Model ☆592 Updated 3 months ago
- [CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis ☆1,336 Updated last month
- Official implementation of UnifiedReward & UnifiedReward-Think ☆417 Updated last week
- Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities ☆893 Updated 2 months ago
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ☆603 Updated 8 months ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆2,332 Updated this week
- ☆1,206 Updated this week
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆893 Updated 4 months ago
- Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation ☆767 Updated last week
- Next-Token Prediction is All You Need ☆2,149 Updated 3 months ago
- Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video] ☆569 Updated 3 weeks ago
- Muon is Scalable for LLM Training ☆1,077 Updated 2 months ago
- Lumina-mGPT 2.0: Stand-alone Autoregressive Image Modeling ☆709 Updated last month
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ☆446 Updated 5 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆532 Updated last month
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆601 Updated 2 months ago
- This repo contains the code for 1D tokenizer and generator ☆908 Updated 3 months ago
- 📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models. ☆578 Updated 2 weeks ago
- [ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation. ☆1,480 Updated this week
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,779 Updated 10 months ago
- MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning ☆654 Updated 3 weeks ago
- [ACL 2025 🔥] Rethinking Step-by-step Visual Reasoning in LLMs ☆302 Updated last month
- ☆789 Updated last week
- ☆378 Updated 2 weeks ago
- A fork to add multimodal model training to open-r1 ☆1,306 Updated 4 months ago
- Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning" ☆203 Updated last week