ModelTC / HarmoniCa
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".
☆41Updated 3 weeks ago
Alternatives and similar repositories for HarmoniCa
Users who are interested in HarmoniCa are comparing it to the libraries listed below.
- [ICML2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection☆22Updated last month
- This repository contains the code for our ICML 2025 paper "LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection" 🎉☆24Updated 2 months ago
- VeriThinker: Learning to Verify Makes Reasoning Model Efficient☆49Updated 3 weeks ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆103Updated 2 months ago
- 📚 Collection of token-level model compression resources.☆147Updated last month
- Doodling our way to AGI ✏️ 🖼️ 🧠☆86Updated 2 months ago
- Official repository of "Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models"☆64Updated this week
- Official implementation of MIA-DPO☆63Updated 6 months ago
- Survey: https://arxiv.org/pdf/2507.20198☆69Updated this week
- 📖 This is a repository for organizing papers, codes, and other resources related to personalized video generation and editing.☆49Updated 2 weeks ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model☆85Updated last month
- Official repository of "ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing"☆52Updated last month
- Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs"☆53Updated last week
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction☆117Updated 5 months ago
- Paper list, tutorial, and nano code snippets for Diffusion Large Language Models.☆96Updated last month
- MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models☆41Updated 4 months ago
- [CVPR] MergeVQ: A Unified Framework for Visual Generation and Representation with Token Merging and Quantization☆42Updated 2 weeks ago
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model☆31Updated 7 months ago
- [ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality…☆51Updated 4 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning"☆134Updated 2 months ago
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation☆83Updated 2 months ago
- [Arxiv] Discrete Diffusion in Large Language and Multimodal Models: A Survey☆185Updated last month
- [CVPRW 2025] UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inpu…☆86Updated 3 months ago
- A Collection of Papers on Diffusion Language Models☆98Updated this week
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"☆36Updated 6 months ago
- SFT+RL boosts multimodal reasoning☆24Updated last month