ziqipang / RandAR
Open implementation of "RandAR"
☆37 Updated last week
Alternatives and similar repositories for RandAR:
Users that are interested in RandAR are comparing it to the libraries listed below
- This is the official implementation for ControlVAR. ☆68 Updated last week
- Liquid: Language Models are Scalable Multi-modal Generators ☆23 Updated this week
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)" ☆30 Updated 6 months ago
- ☆37 Updated last year
- [ICLR 2024] Official implementation of the paper "Toss: High-quality text-guided novel view synthesis from a single image" ☆20 Updated 7 months ago
- Implements VAR+CLIP for text-to-image (T2I) generation ☆94 Updated 2 weeks ago
- Official implementation of "Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive" (ICLR 2024) ☆52 Updated 3 months ago
- The repository contains the official implementation of "Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation" ☆21 Updated 3 weeks ago
- [NIPS24] Official Implementation of Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation ☆15 Updated last month
- [ECCV-24] This is the official implementation of the paper "SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation". ☆20 Updated 2 months ago
- CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient ☆73 Updated 2 weeks ago
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆78 Updated 2 months ago
- Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation (ICCV 2023) ☆63 Updated last year
- ☆21 Updated last week
- ☆16 Updated 2 weeks ago
- Code of our CVPR2024 paper - DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data ☆45 Updated 8 months ago
- Sora Generates Videos with Stunning Geometrical Consistency ☆47 Updated 8 months ago
- DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention ☆116 Updated 3 weeks ago
- XQ-GAN: An Open-source Image Tokenization Framework for Autoregressive Generation ☆149 Updated last week
- EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation ☆83 Updated last month
- ☆32 Updated last month
- Can 3D Vision-Language Models Truly Understand Natural Language? ☆21 Updated 8 months ago
- ☆15 Updated last year
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training" ☆30 Updated 7 months ago
- Official implementation of PARIS3D (Accepted to ECCV 2024). ☆20 Updated 2 months ago
- ☆58 Updated last year
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆38 Updated 6 months ago
- [NeurIPS2023] Implementation of the paper: Explore In-Context Learning for 3D Point Cloud Understanding ☆66 Updated 2 weeks ago
- [ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation ☆32 Updated 3 months ago