sangminwoo / awesome-token-redundancy-reduction
Awesome papers on token redundancy reduction
★11 · Updated 10 months ago
Alternatives and similar repositories for awesome-token-redundancy-reduction
Users who are interested in awesome-token-redundancy-reduction are comparing it to the repositories listed below
- A paper list about Token Merge, Reduce, Resample, Drop for MLLMs. ★80 · Updated 3 months ago
- ★64 · Updated last week
- [EMNLP 2025 main] Code for "Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More" ★100 · Updated 3 months ago
- Code release for VTW (AAAI 2025 Oral) ★64 · Updated 2 months ago
- LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models ★162 · Updated 4 months ago
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model ★37 · Updated last year
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster. ★104 · Updated 7 months ago
- (CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction ★141 · Updated 10 months ago
- [CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models ★97 · Updated 2 months ago
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning" ★54 · Updated 3 months ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio… ★44 · Updated 9 months ago
- [ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation ★221 · Updated 10 months ago
- Latest open-source "Thinking with images" (O3/O4-mini) papers, covering training-free, SFT-based, and RL-enhanced methods for "fine-grain… ★110 · Updated 5 months ago
- Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization ★100 · Updated 2 years ago
- [AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi… ★76 · Updated 9 months ago
- ★110 · Updated last year
- Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning ★45 · Updated 6 months ago
- [EMNLP 2024 Findings] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ★103 · Updated last year
- [ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont… ★68 · Updated 4 months ago
- Collection of token-level model compression resources. ★189 · Updated 4 months ago
- [TMLR 2025] Efficient Reasoning Models: A Survey ★296 · Updated 3 weeks ago
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models ★77 · Updated last year
- An RLHF Infrastructure for Vision-Language Models ★193 · Updated last year
- [ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference" and "Sp… ★234 · Updated last month
- [NeurIPS 2025] Official code for paper: Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs. ★84 · Updated 4 months ago
- PyTorch Implementation of "Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Larg… ★36 · Updated last month
- [ICLR 2025] MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation ★132 · Updated 4 months ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models ★152 · Updated 6 months ago
- Metis-RISE: RL Incentivizes and SFT Enhances Multimodal Reasoning Model Learning ★23 · Updated 7 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ★104 · Updated 4 months ago