zhanghm1995 / Awesome-VAR
A curated list of resources on Visual AutoRegressive Modeling, which makes GPT-style AR models surpass diffusion transformers in image generation.
☆38 Updated 10 months ago
Alternatives and similar repositories for Awesome-VAR
Users who are interested in Awesome-VAR are comparing it to the repositories listed below.
- ICCV 2025-PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models ☆52 Updated last week
- Official implementation for What matters for Representation Alignment: Global Information or Spatial Structure? ☆174 Updated 3 weeks ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆204 Updated 5 months ago
- [CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention ☆176 Updated 10 months ago
- [NeurIPS 2025 Oral] Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think ☆233 Updated 3 months ago
- [CVPR 2025] A Unified Image-Dense Annotation Generation Model for Underwater Scenes ☆51 Updated 9 months ago
- Implements VAR+CLIP for text-to-image (T2I) generation ☆147 Updated 11 months ago
- [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models ☆316 Updated 8 months ago
- This is the official implementation for ControlVAR. ☆126 Updated last year
- Official PyTorch implementation of GeoDiffusion in ICLR 2024 (https://arxiv.org/abs/2306.04607) ☆96 Updated 5 months ago
- “FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching” FlowAR employs a simplest scale design and is compatible with an… ☆163 Updated 8 months ago
- [CVPR 2025] Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers ☆40 Updated 4 months ago
- Pytorch implementation of GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting ☆101 Updated 9 months ago
- ☆170 Updated 6 months ago
- [CVPR 2024] Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training ☆43 Updated last year
- The first decoder-only multimodal state space model ☆97 Updated 7 months ago
- CAR: Controllable AutoRegressive Modeling for Visual Generation ☆127 Updated last year
- Visual Spatial Tuning ☆161 Updated this week
- [CVPR 2025] HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation ☆60 Updated 6 months ago
- [NeurIPS DB 2025] IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering ☆42 Updated 2 months ago
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat… ☆241 Updated 2 months ago
- [AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention ☆115 Updated last year
- [IJCAI 2023] official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation ☆34 Updated 2 years ago
- ☆54 Updated 3 months ago
- [CVPR 2024] Exploiting Diffusion Prior for Generalizable Dense Prediction ☆80 Updated last year
- Frequency Autoregressive Image Generation with Continuous Tokens ☆94 Updated 7 months ago
- Official PyTorch codes for "Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation", ECCV2024 ☆30 Updated last year
- Official implementation of "Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation" ☆14 Updated 9 months ago
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆167 Updated 3 weeks ago
- [Arxiv'25] MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization ☆54 Updated 3 months ago