kawhiiiileo / FiCoCo
[AAAI 26'] This is the official PyTorch implementation of the paper: Filter, Correlate, Compress: Training-Free Token Reduction for MLLM Acceleration
☆45Updated last week
Alternatives and similar repositories for FiCoCo
Users who are interested in FiCoCo are comparing it to the repositories listed below
- Repository for awesome spatial/visual reasoning MLLMs. (focus more on embodied applications)☆72Updated 4 months ago
- OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models☆139Updated 6 months ago
- R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization☆433Updated last month
- The official repo of the paper "MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly"☆162Updated last week
- [AAAI 2026] Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models☆34Updated 2 weeks ago
- ☆28Updated 5 months ago
- [NAACL 2025🔥] MEDA: Dynamic KV Cache Allocation for Efficient Multimodal Long-Context Inference☆14Updated 5 months ago
- ☆60Updated 6 months ago
- Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos☆296Updated last month
- Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model☆36Updated 10 months ago
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster.☆97Updated 4 months ago
- Code release for VTW (AAAI 2025 Oral)☆61Updated 2 weeks ago
- The official implementation of "Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings"☆13Updated 11 months ago
- 📚 Collection of token-level model compression resources.☆182Updated 2 months ago
- [NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding☆159Updated 4 months ago
- ☆51Updated last week
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆109Updated 5 months ago
- The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learn…☆32Updated this week
- ☆166Updated 3 months ago
- [ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont…☆62Updated 2 months ago
- A Self-Training Framework for Vision-Language Reasoning☆86Updated 10 months ago
- R1-Track: Direct Application of MLLMs to Visual Object Tracking via Reinforcement Learning.☆64Updated 6 months ago
- ✨✨[AAAI 2026] This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Vi…☆73Updated 6 months ago
- [NeurIPS 2025] MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning☆87Updated 2 months ago
- [ICCV 2025] p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay☆43Updated 4 months ago
- ☆72Updated 6 months ago
- This repository introduces a comprehensive paper list, datasets, methods, and tools for memory research.☆314Updated 5 months ago
- PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation [NeurIPS 2025]☆13Updated last month
- Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"☆96Updated 2 months ago
- [NeurIPS 2025] Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO☆64Updated 3 weeks ago