JCZ404 / Awesome-Visual-Autoregressive
Curated list of recent visual autoregressive (VAR) modeling works
☆30 Updated last month
Alternatives and similar repositories for Awesome-Visual-Autoregressive:
Users who are interested in Awesome-Visual-Autoregressive are comparing it to the repositories listed below
- Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation ☆69 Updated last week
- ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning ☆28 Updated 2 weeks ago
- Frequency Autoregressive Image Generation with Continuous Tokens ☆56 Updated last month
- ReNeg: Learning Negative Embedding with Reward Guidance ☆31 Updated 3 months ago
- Official implementation of "STAR: Scale-wise Text-to-image generation via Auto-Regressive representations" ☆30 Updated last month
- [CVPR 2025 (Oral)] Open implementation of "RandAR" ☆118 Updated last month
- [NeurIPS 2024] Video Diffusion Models are Training-free Motion Interpreter and Controller ☆40 Updated last week
- Official code for MotionBench (CVPR 2025) ☆34 Updated last month
- Official Implementation of VideoDPO ☆92 Updated 3 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation ☆69 Updated last month
- Official implementation of LaVin-DiT ☆31 Updated 2 months ago
- ICCV2023-Diffusion-Papers ☆109 Updated last year
- PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT ☆73 Updated this week
- ☆39 Updated last year
- FQGAN: Factorized Visual Tokenization and Generation ☆48 Updated 3 weeks ago
- VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation ☆20 Updated last month
- A collection of vision foundation models unifying understanding and generation. ☆51 Updated 3 months ago
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆15 Updated last week
- ☆13 Updated 2 months ago
- [NeurIPS 2024] The official implementation of the research paper "FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Atten… ☆42 Updated 2 months ago
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation ☆80 Updated 2 weeks ago
- A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning. ☆26 Updated 3 weeks ago
- ☆19 Updated last week
- (ECCV 2024) Official implementation of the paper "DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation" ☆39 Updated 6 months ago
- Official GitHub repository for the Text-Guided Video Editing (TGVE) competition of LOVEU Workshop @ CVPR'23. ☆75 Updated last year
- Official repo for "Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge" (ICLR 2025) ☆46 Updated last month
- Official code repo of the CVPR 2025 paper PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation ☆21 Updated last month
- Diffusion Powers Video Tokenizer for Comprehension and Generation (CVPR 2025) ☆66 Updated last month
- Official PyTorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral). ☆65 Updated 2 months ago
- This is a repository to collect training-free algorithms for visual generation and manipulation ☆32 Updated this week