yu-rp / Dimple
Dimple, the first Discrete Diffusion Multimodal Large Language Model
☆60 Updated last week
Alternatives and similar repositories for Dimple
Users who are interested in Dimple compare it to the repositories listed below.
- VeriThinker: Learning to Verify Makes Reasoning Model Efficient ☆38 Updated last week
- This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality" ☆47 Updated 2 months ago
- ☆111 Updated last week
- ☆74 Updated 2 weeks ago
- Data distillation benchmark ☆64 Updated this week
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient ☆102 Updated 2 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling ☆31 Updated 3 months ago
- Adapting LLaMA Decoder to Vision Transformer ☆28 Updated last year
- Autoregressive Image Generation with Randomized Parallel Decoding ☆63 Updated 2 months ago
- ✈️ Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ☆67 Updated 2 months ago
- ☆78 Updated 2 months ago
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching ☆104 Updated 10 months ago
- Empowering Unified MLLM with Multi-granular Visual Generation ☆124 Updated 4 months ago
- [CVPR] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization ☆24 Updated 2 months ago
- [ICLR 2025] Source code for paper "A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegr… ☆75 Updated 5 months ago
- The official implementation of Recurrent Diffusion for Large-Scale Parameter Generation. ☆55 Updated 3 months ago
- A Collection of Papers on Diffusion Language Models ☆60 Updated this week
- Code for CVPR 2024 Oral "Neural Lineage" ☆17 Updated 11 months ago
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt… ☆30 Updated 7 months ago
- [CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models ☆40 Updated 3 months ago
- A paper list about Token Merge, Reduce, Resample, Drop for MLLMs. ☆61 Updated 4 months ago
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆86 Updated 7 months ago
- Official Repository of paper: Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing ☆53 Updated last week
- [CVPR 2025] Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training ☆45 Updated 2 months ago
- HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ☆62 Updated 3 months ago
- Vico: Compositional Video Generation as Flow Equalization ☆58 Updated 6 months ago
- Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆117 Updated 2 weeks ago
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆69 Updated 7 months ago
- [Preprint 2025] Thinkless: LLM Learns When to Think ☆125 Updated this week
- [NeurIPS 2024] The official implementation of the research paper "FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Atten… ☆44 Updated 3 months ago