ModelTC / HarmoniCa
[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".
☆42Updated 2 months ago
Alternatives and similar repositories for HarmoniCa
Users who are interested in HarmoniCa compare it to the libraries listed below.
- This repository contains the code for our ICML 2025 paper, LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection🎉☆24Updated 3 months ago
- [ICML 2025] Official Code of From Local Details to Global Context: Advancing Vision-Language Models with Attention-Based Selection☆22Updated 2 months ago
- VeriThinker: Learning to Verify Makes Reasoning Model Efficient☆52Updated 2 months ago
- ☆55Updated 4 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠☆102Updated 3 months ago
- 📚 Collection of token-level model compression resources.☆158Updated 2 weeks ago
- Official implementation of MIA-DPO☆65Updated 7 months ago
- Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'☆56Updated 2 months ago
- Code release for VTW (AAAI 2025 Oral)☆49Updated 2 months ago
- [Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics]: VisuoThink: Empowering LVLM Reasoning with Mul…☆29Updated last month
- Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs"☆63Updated last month
- 📖 This is a repository for organizing papers, codes, and other resources related to personalized video generation and editing.☆53Updated last week
- MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models☆41Updated 5 months ago
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model☆34Updated 8 months ago
- ☆43Updated 2 months ago
- Survey: https://arxiv.org/pdf/2507.20198☆139Updated last week
- Dimple, the first Discrete Diffusion Multimodal Large Language Model☆98Updated 2 months ago
- SFT+RL boosts multimodal reasoning☆30Updated 2 months ago
- ☆69Updated this week
- Code for the paper "AsFT: Anchoring Safety During LLM Fine-Tuning Within Narrow Safety Basin".☆27Updated 2 months ago
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction☆122Updated 6 months ago
- A Collection of Papers on Diffusion Language Models☆123Updated this week
- Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens (arXiv 2025)☆155Updated last month
- A framework for unified personalized model, achieving mutual enhancement between personalized understanding and generation. Demonstrating…☆121Updated last month
- Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning☆63Updated 4 months ago
- [ICML 2025] This is the official PyTorch implementation of "ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality…☆54Updated 5 months ago
- A Massive Multi-Discipline Lecture Understanding Benchmark☆30Updated 2 months ago
- TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation☆215Updated last month
- MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision☆25Updated 3 months ago
- (ArXiv25) Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning☆55Updated last month