kuleshov-group / bd3lms
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
☆439 Updated this week
Alternatives and similar repositories for bd3lms:
Users who are interested in bd3lms are comparing it to the libraries listed below.
- This repo contains the code for 1D tokenizer and generator ☆794 Updated last week
- HART: Efficient Visual Generation with Hybrid Autoregressive Transformer ☆482 Updated 5 months ago
- Scaling Diffusion Transformers with Mixture of Experts ☆304 Updated 6 months ago
- [ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think ☆894 Updated 2 weeks ago
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆856 Updated last month
- Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI ☆1,006 Updated 2 weeks ago
- code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion" ☆791 Updated this week
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models ☆209 Updated 3 weeks ago
- Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraini… ☆551 Updated 7 months ago
- Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch ☆335 Updated 2 months ago
- Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training ☆255 Updated last month
- Implementation of Autoregressive Diffusion in Pytorch ☆366 Updated 4 months ago
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆255 Updated 2 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆445 Updated this week
- [ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834) ☆539 Updated last year
- [ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters ☆541 Updated last month
- Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation ☆1,639 Updated 7 months ago
- [ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation ☆753 Updated 6 months ago
- [CVPR 2025] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models ☆595 Updated this week
- Simple and Effective Masked Diffusion Language Model ☆351 Updated 3 weeks ago
- Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers" ☆800 Updated last year
- Muon optimizer: +>30% sample efficiency with <3% wallclock overhead ☆539 Updated last week
- Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text" ☆147 Updated 3 months ago
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ☆403 Updated 2 months ago
- [ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation. ☆1,314 Updated this week
- Official implementation of Inductive Moment Matching ☆414 Updated 3 weeks ago
- Official PyTorch implementation for "Large Language Diffusion Models" ☆1,350 Updated 2 weeks ago
- ☆422 Updated 3 months ago
- Implementation of MagViT2 Tokenizer in Pytorch ☆597 Updated 2 months ago
- [ICML 2024] CLLMs: Consistency Large Language Models ☆388 Updated 4 months ago