[ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality"
☆53Mar 25, 2025Updated 11 months ago
Alternatives and similar repositories for ZipAR
Users that are interested in ZipAR are comparing it to the libraries listed below
Sorting:
- [NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification☆32Mar 30, 2025Updated 11 months ago
- [ICCV 2025] The official implementation of "Neighboring Autoregressive Modeling for Efficient Visual Generation"☆59Apr 5, 2025Updated 10 months ago
- CoV: Chain-of-View Prompting for Spatial Reasoning☆51Jan 23, 2026Updated last month
- torch_quantizer is a out-of-box quantization tool for PyTorch models on CUDA backend, specially optimized for Diffusion Models.☆23Mar 29, 2024Updated last year
- [ICLR 2024 Spotlight] This is the official PyTorch implementation of "EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Di…☆68Jun 4, 2024Updated last year
- Streaming Video Diffusion: Online Video Editing with Diffusion Models☆18Jun 3, 2024Updated last year
- [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient☆108Sep 27, 2025Updated 5 months ago
- [ICLR 2025] Official PyTorch implmentation of paper "T-Stitch: Accelerating Sampling in Pre-trained Diffusion Models with Trajectory Stit…☆104Feb 26, 2024Updated 2 years ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models☆103Mar 12, 2024Updated last year
- ☆16Sep 12, 2023Updated 2 years ago
- a collection of awesome autoregressive visual generation models☆79Apr 17, 2025Updated 10 months ago
- ☆15Mar 21, 2025Updated 11 months ago
- The official repo of continuous speculative decoding☆31Mar 28, 2025Updated 11 months ago
- Curated list of methods that focuses on improving the efficiency of diffusion models☆44Jul 9, 2024Updated last year
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"☆314Dec 23, 2024Updated last year
- ☆190Jan 14, 2025Updated last year
- [CoLM'25] The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression>☆155Jan 14, 2026Updated last month
- Official Pytorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral).☆98Feb 11, 2025Updated last year
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"☆31Dec 23, 2024Updated last year
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation☆149Mar 21, 2025Updated 11 months ago
- ☆12Dec 20, 2024Updated last year
- ☆18Nov 26, 2025Updated 3 months ago
- ☆11Jan 3, 2024Updated 2 years ago
- [ACL 2024] Official PyTorch implementation of "IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact"☆47May 24, 2024Updated last year
- FQGAN: Factorized Visual Tokenization and Generation☆59Mar 29, 2025Updated 11 months ago
- [COLM 2025] Official PyTorch implementation of "Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models"☆71Jul 8, 2025Updated 7 months ago
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better☆16Feb 15, 2025Updated last year
- Code Repository for the NeurIPS 2022 paper: "Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights".☆17Jul 10, 2024Updated last year
- A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including languag…☆204Feb 10, 2025Updated last year
- [ICCV 2025] QuEST: Efficient Finetuning for Low-bit Diffusion Models☆57Jun 26, 2025Updated 8 months ago
- Official Code For Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM☆14Dec 27, 2023Updated 2 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Jun 12, 2024Updated last year
- ☆15Sep 22, 2025Updated 5 months ago
- [NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.☆180Oct 3, 2024Updated last year
- ☆11Sep 7, 2024Updated last year
- ☆15Sep 22, 2024Updated last year
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes☆21Jun 15, 2025Updated 8 months ago
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders☆18May 23, 2025Updated 9 months ago
- super-resolution; post-training quantization; model compression☆14Nov 10, 2023Updated 2 years ago