daixiangzi / VAR-CLIP
Implements VAR+CLIP for text-to-image (T2I) generation
☆128Updated last month
Alternatives and similar repositories for VAR-CLIP:
Users who are interested in VAR-CLIP are comparing it to the repositories listed below.
- This is the official implementation for ControlVAR.☆99Updated 3 months ago
- High-performance Image Tokenizers for VAR and AR☆213Updated this week
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".☆286Updated 2 weeks ago
- [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models☆205Updated last month
- This is a repo to track the latest autoregressive visual generation papers.☆164Updated this week
- CAR: Controllable AutoRegressive Modeling for Visual Generation☆106Updated 3 months ago
- [ICLR2025]☆138Updated last month
- [NeurIPS 2024] The official code of "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers"☆190Updated 5 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization☆413Updated this week
- [CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention☆154Updated 2 weeks ago
- STAR: Scale-wise Text-to-image generation via Auto-Regressive representations☆137Updated last month
- “FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching” FlowAR employs the simplest scale design and is compatible with an…☆92Updated 2 months ago
- 🔥stable, simple, state-of-the-art VQVAE toolkit & cookbook☆85Updated 8 months ago
- T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation☆68Updated last week
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient☆86Updated 2 weeks ago
- [ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation☆253Updated 2 weeks ago
- [ICLR 2025] Diffusion Feedback Helps CLIP See Better☆266Updated last month
- [CVPR 2025] Open implementation of "RandAR"☆60Updated 2 months ago
- [NeurIPS 2023 & TPAMI] T2I-CompBench (++) for Compositional Text-to-image Generation Evaluation☆240Updated last month
- [CVPR 2025] PAR: Parallelized Autoregressive Visual Generation. https://epiphqny.github.io/PAR-project/☆126Updated 2 months ago
- ☆164Updated last month
- [NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models☆268Updated 3 months ago
- The collection of awesome papers on alignment of diffusion models.☆138Updated 2 weeks ago
- [CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models☆161Updated 5 months ago
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression"☆84Updated 5 months ago
- [CVPR 2025] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models☆471Updated last week
- A collection of diffusion models based on FLUX/DiT for image/video generation, editing, reconstruction, inpainting, etc.☆30Updated this week