tyfeld / MMaDA-Parallel
Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"
☆269 Updated 2 weeks ago
Alternatives and similar repositories for MMaDA-Parallel
Users who are interested in MMaDA-Parallel are comparing it to the repositories listed below.
- Official PyTorch implementation of TokenSet. ☆127 Updated 8 months ago
- The official github repo for "Diffusion Language Models are Super Data Learners". ☆207 Updated 3 weeks ago
- [ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction ☆79 Updated 6 months ago
- Official implementation of "Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs". ☆94 Updated 3 weeks ago
- [ICML 2025] This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3?" ☆143 Updated last year
- The open-source code of MetaStone-S1. ☆107 Updated 4 months ago
- UniDisc: A discrete diffusion model for joint multimodal generation, enabling controllable and efficient text-image synthesis, editing, a… ☆131 Updated 8 months ago
- Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think! ☆120 Updated 9 months ago
- 🦾 EvalGIM (pronounced as "EvalGym") is an evaluation library for generative image models. It enables easy-to-use, reproducible automatic… ☆88 Updated 11 months ago
- The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation. ☆53 Updated last year
- [MTI-LLM@NeurIPS 2025] Official implementation of "PyVision: Agentic Vision with Dynamic Tooling." ☆137 Updated 4 months ago
- Official implementation for SSDD Single-Step Diffusion Decoder for Efficient Image Tokenization. ☆46 Updated 3 weeks ago
- [ACL2025 Oral & Award] Evaluate Image/Video Generation like Humans - Fast, Explainable, Flexible ☆109 Updated 3 months ago
- ☆137 Updated 3 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆43 Updated last year
- Inference-time scaling of diffusion-based image and video generation models. ☆172 Updated 5 months ago
- Official implementation of Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (NeurIPS 2025) ☆43 Updated last week
- Official Implementation of "LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis" ☆75 Updated 3 months ago
- An open source implementation of CLIP (With TULIP Support) ☆163 Updated 6 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025 ☆31 Updated 7 months ago
- ☆48 Updated last week
- [Preprint] Efficient Generative Model Training via Embedded Representation Warmup ☆36 Updated last month
- Code release for "LLMs can see and hear without any training" ☆454 Updated 6 months ago
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation, arXiv 2024 ☆66 Updated last month
- Vision Language Models are Biased ☆101 Updated 5 months ago
- Pivotal Token Search ☆131 Updated 4 months ago
- Official codes of "Monet: Reasoning in Latent Visual Space Beyond Image and Language" ☆31 Updated last week
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆109 Updated last month
- ☆105 Updated 5 months ago
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆79 Updated 11 months ago