FoundationVision / Autoregressive-Models-in-Vision-Survey
A collection of papers on autoregressive models in vision.
☆10Updated 3 months ago
Alternatives and similar repositories for Autoregressive-Models-in-Vision-Survey
Users who are interested in Autoregressive-Models-in-Vision-Survey are comparing it to the repositories listed below.
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling.☆35Updated 11 months ago
- Inference-only implementation of "One-Step Diffusion Distillation through Score Implicit Matching" [NeurIPS 2024]☆81Updated 7 months ago
- Code for Paper 'Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach'☆17Updated 8 months ago
- [CVPR 2025 AI4CC Workshop] Official Implementation of HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editin…☆30Updated last month
- The repository for AP-LDM☆14Updated 8 months ago
- AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model☆29Updated this week
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a vision encoder. Unifies image understanding and generation.☆37Updated last year
- ☆48Updated 3 months ago
- ☆19Updated 2 years ago
- [NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…☆37Updated last year
- ☆17Updated 10 months ago
- The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation"☆51Updated 2 months ago
- Codes accompanying the paper "Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment"☆33Updated 4 months ago
- [arXiv: 2502.05178] QLIP: Text-Aligned Visual Tokenization Unifies Auto-Regressive Multimodal Understanding and Generation☆75Updated 3 months ago
- ☆48Updated this week
- ☆34Updated last week
- Stable Consistency Tuning: Understanding and Improving Consistency Models☆16Updated 7 months ago
- The official repo for "D-AR: Diffusion via Autoregressive Models"☆96Updated 2 weeks ago
- The official repo of continuous speculative decoding☆27Updated 2 months ago
- minisora-DiT, a DiT reproduction based on XTuner from the open source community MiniSora☆40Updated last year
- Codebase for the paper "Elucidating the design space of language models for image generation"☆45Updated 7 months ago
- [ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model☆16Updated 4 months ago
- HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation☆63Updated 4 months ago
- The official repository for the RealSyn dataset☆34Updated last month
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling☆34Updated 4 months ago
- [ICLR 2025] Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching☆47Updated 2 months ago
- Video Diffusion State Space Models☆19Updated last year
- [ECCV 2024] Official PyTorch implementation of "Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts"☆43Updated 11 months ago
- No Other Representation Component Is Needed: Diffusion Transformers Can Provide Representation Guidance by Themselves☆59Updated last week
- [Preprint] GMem: A Modular Approach for Ultra-Efficient Generative Models☆37Updated 3 months ago