zhanghm1995 / Awesome-VAR
A curated list of resources on Visual AutoRegressive Modeling, which makes GPT-style AR models surpass diffusion transformers in image generation.
☆36Updated 4 months ago
Alternatives and similar repositories for Awesome-VAR
Users who are interested in Awesome-VAR are comparing it to the repositories listed below.
- ICCV 2025-PerLDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models☆48Updated last week
- Official PyTorch implementation of GeoDiffusion in ICLR 2024 (https://arxiv.org/abs/2306.04607)☆90Updated 6 months ago
- [CVPR 2025 (Oral)] Open implementation of "RandAR"☆178Updated this week
- [CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention☆168Updated 4 months ago
- This is the official implementation for ControlVAR.☆116Updated 7 months ago
- Implements VAR+CLIP for text-to-image (T2I) generation☆142Updated 5 months ago
- [CVPR 2025] A Unified Image-Dense Annotation Generation Model for Underwater Scenes☆32Updated 3 months ago
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat…☆222Updated 2 months ago
- [ICLR 2025] ControlAR: Controllable Image Generation with Autoregressive Models☆280Updated 2 months ago
- ☆38Updated 11 months ago
- ☆135Updated 3 weeks ago
- [CVPR 2025] Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers☆32Updated 2 weeks ago
- This repository is dedicated to Track 2 of the W-CODA 2024 Workshop, "Multimodal Perception and Comprehension of Corner Cases in Autonomo…☆11Updated last year
- PyTorch implementation of GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting☆86Updated 3 months ago
- [IEEE RA-L 2025] Generate Weather with LLM. Code for "WeatherDG: LLM-assisted Procedural Weather Generation for Domain-Generalized Semant…☆38Updated last month
- [CVPR 2025 Highlight🔥] Official code repository for "Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuni…☆93Updated 2 months ago
- Official Code of IR3D-Bench: Evaluating Vision-Language Model Scene Understanding as Agentic Inverse Rendering☆33Updated 2 weeks ago
- [ICLR 2025] Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving☆40Updated 5 months ago
- “FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching” FlowAR employs a simplest scale design and is compatible with an…☆133Updated 2 months ago
- Code of our CVPR2024 paper - DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data☆58Updated last year
- Official implementation of "Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive" (ICLR 2024)☆55Updated 10 months ago
- ☆27Updated 10 months ago
- [ICRA2025] A dual-branch conditional diffusion model designed to enhance driving scene generation across multiple views and video sequenc…☆33Updated 2 months ago
- ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention (ECCV 2024)☆80Updated 2 months ago
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT☆120Updated 3 months ago
- A list of works on video generation towards world model☆157Updated this week
- [AAAI 2025] GFlow: Recovering 4D World from Monocular Video☆46Updated 2 months ago
- The first decoder-only multimodal state space model☆92Updated 2 months ago
- CAR: Controllable AutoRegressive Modeling for Visual Generation☆121Updated 7 months ago
- [AAAI 2025] Linear-complexity Visual Sequence Learning with Gated Linear Attention☆110Updated last year