ModelTC / HarmoniCa
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".
☆44Updated 5 months ago
Alternatives and similar repositories for HarmoniCa
Users who are interested in HarmoniCa are comparing it to the repositories listed below.
- This repository contains the code for our ICML 2025 paper, "LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection" 🎉☆25Updated 6 months ago
- [NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient☆63Updated 2 months ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model☆112Updated 5 months ago
- [ICML2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection☆24Updated 5 months ago
- LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling☆129Updated this week
- A collection of awesome "thinking with videos" papers.☆72Updated 2 weeks ago
- ☆29Updated last week
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆234Updated 3 months ago
- ☆55Updated 3 months ago
- [NeurIPS 2025 Spotlight] Official PyTorch implementation of Vgent☆30Updated 2 weeks ago
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆90Updated 2 months ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"☆119Updated 2 weeks ago
- 📚 Collection of token-level model compression resources.☆185Updated 3 months ago
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)☆205Updated 4 months ago
- ☆147Updated 2 weeks ago
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation☆31Updated 4 months ago
- MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models☆41Updated 8 months ago
- This is a collection of recent papers on reasoning in video generation models.☆76Updated last week
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT☆106Updated last month
- Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning"☆70Updated last week
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models☆35Updated last week
- Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"☆87Updated 2 weeks ago
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision☆181Updated 3 weeks ago
- ☆65Updated 7 months ago
- Official repository of "ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing"☆58Updated 5 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to personalized video generation and editing.☆60Updated this week
- [ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality…☆53Updated 8 months ago
- Official Implementation of Muddit [Meissonic II]: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model.☆95Updated last month
- ☆62Updated 7 months ago
- A Comprehensive Dataset for Advanced Image Generation and Editing☆30Updated 2 months ago