(ToCa-v2) A New version of ToCa,with faster speed and better acceleration!
☆40Mar 13, 2025Updated 11 months ago
Alternatives and similar repositories for DuCa
Users that are interested in DuCa are comparing it to the libraries listed below
Sorting:
- [ICLR2025] Accelerating Diffusion Transformers with Token-wise Feature Caching☆210Mar 14, 2025Updated 11 months ago
- [ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers☆373Feb 16, 2026Updated 2 weeks ago
- ☆18Jun 10, 2025Updated 8 months ago
- (ICCV2025) EEdit⚡: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing☆60Sep 17, 2025Updated 5 months ago
- An official repository for GPTailor☆17Jun 29, 2025Updated 8 months ago
- The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …☆50Jun 6, 2025Updated 8 months ago
- FORA introduces simple yet effective caching mechanism in Diffusion Transformer Architecture for faster inference sampling.☆52Jul 8, 2024Updated last year
- ☆23May 21, 2025Updated 9 months ago
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding"☆23Dec 8, 2024Updated last year
- [ACL 2025] Can MLLMs Understand the Deep Implication Behind Chinese Images?☆20Oct 20, 2025Updated 4 months ago
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models☆38Jan 27, 2026Updated last month
- This is the official repo for the paper "Accelerating Parallel Sampling of Diffusion Models" Tang et al. ICML 2024 https://openreview.net…☆16Jul 19, 2024Updated last year
- \infty-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation☆19Feb 14, 2025Updated last year
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet…☆18Aug 28, 2024Updated last year
- Official Code Implementation for 'A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models'☆20Jul 24, 2024Updated last year
- ☆15Feb 21, 2024Updated 2 years ago
- Code and data for paper "Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation".☆23Oct 22, 2025Updated 4 months ago
- Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation: A framework for generating multimodal music by bridging dif…☆28Jan 21, 2025Updated last year
- Beyond KV Caching: Shared Attention for Efficient LLMs☆20Jul 19, 2024Updated last year
- The official code for NeurIPS 2025 "MagCache: Fast Video Generation with Magnitude-Aware Cache"☆260Nov 17, 2025Updated 3 months ago
- Official implemention for Diffusion Models Are Innate One-Step Generators☆26Jun 25, 2025Updated 8 months ago
- [NeurIPS 2024] TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration☆26Oct 17, 2024Updated last year
- ☆78Jan 22, 2026Updated last month
- PiKV: KV Cache Management System for Mixture of Experts [Efficient ML System]☆48Feb 24, 2026Updated last week
- [CVPR 2025] The official implementation of "CacheQuant: Comprehensively Accelerated Diffusion Models"☆44Nov 2, 2025Updated 4 months ago
- [ECCV 2024] Efficient Inference of Vision Instruction-Following Models with Elastic Cache☆43Jul 26, 2024Updated last year
- [ICLR 2025] FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality☆261Dec 27, 2024Updated last year
- A Text2SQL benchmark for evaluation of Large Language Models☆41Updated this week
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching☆117Jul 15, 2024Updated last year
- Official repository of "CoMP: Continual Multimodal Pre-training for Vision Foundation Models"☆45Apr 3, 2025Updated 11 months ago
- Official repo of the ICLR 2025 paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"☆28Jul 15, 2025Updated 7 months ago
- Make Your Training Flexible: Towards Deployment-Efficient Video Models☆37Jun 11, 2025Updated 8 months ago
- [CVPR'25] Official repository for "Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Eva…☆44Jan 7, 2026Updated last month
- ☆39May 20, 2025Updated 9 months ago
- Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model☆1,271Jun 8, 2025Updated 8 months ago
- [CVPR2026] VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice☆65Updated this week
- ☆32May 26, 2024Updated last year
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios☆16Oct 18, 2024Updated last year