ModelTC / HarmoniCa
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".
☆43Updated 4 months ago
Alternatives and similar repositories for HarmoniCa
Users that are interested in HarmoniCa are comparing it to the libraries listed below
- This repository contains the code for our ICML 2025 paper "LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection"🎉☆24Updated 5 months ago
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient☆62Updated last month
- [ICML 2025] Official code of "From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection"☆24Updated 4 months ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model☆109Updated 4 months ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"☆96Updated last week
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision☆170Updated 2 weeks ago
- 📖 This is a repository for organizing papers, codes, and other resources related to personalized video generation and editing.☆58Updated this week
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆231Updated 3 months ago
- MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models☆41Updated 7 months ago
- ☆63Updated 6 months ago
- ☆55Updated 3 months ago
- 📚 Collection of token-level model compression resources.☆182Updated 2 months ago
- ☆60Updated 6 months ago
- [CVPR] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization☆46Updated 4 months ago
- Fast-Slow Thinking for Large Vision-Language Model Reasoning☆21Updated 6 months ago
- Official repository of "ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing"☆57Updated 5 months ago
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints☆76Updated 4 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)☆194Updated 3 months ago
- Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model.☆95Updated 3 weeks ago
- A Comprehensive Dataset for Advanced Image Generation and Editing☆29Updated last month
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆87Updated 2 months ago
- ☆132Updated last month
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction☆134Updated 8 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠☆113Updated 5 months ago
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models☆34Updated last week
- The code repository of UniRL☆46Updated 5 months ago
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation☆30Updated 4 months ago
- A Collection of Papers on Diffusion Language Models☆145Updated 2 months ago
- Official implementation of MIA-DPO☆67Updated 10 months ago
- ☆62Updated 4 months ago