locuslab/llava-token-compression

locuslab / llava-token-compression

☆47

Alternatives and similar repositories for llava-token-compression

Users who are interested in llava-token-compression are comparing it to the repositories listed below.

ywh187 / FitPrune
☆68 · Jan 23, 2026 · Updated 5 months ago
luka-group / vlm-knowledge-conflict
Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."
☆54 · Oct 19, 2024 · Updated last year
Seeing-Fast-and-Slow / Seeing-Fast-and-Slow
☆16 · May 28, 2026 · Updated last month
VITA-Group / SSM-Bottleneck
[ICLR'25] "Understanding Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing" by Peihao Wang, Ruisi Cai, Yue…
☆18 · Mar 21, 2025 · Updated last year
SUSTechBruce / LOOK-M
[EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…
☆103 · Nov 9, 2024 · Updated last year
LaVi-Lab / Visual-Table
[EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"
☆20 · Oct 17, 2024 · Updated last year
locuslab / T-MARS
Code for T-MARS data filtering
☆35 · Aug 23, 2023 · Updated 2 years ago
CircleRadon / TokenPacker
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025
☆278 · May 26, 2025 · Updated last year
FatemehShiri / Spatial-MM
☆12 · Jan 10, 2025 · Updated last year
Victorwz / LLaVA-Unified
☆23 · Aug 27, 2025 · Updated 10 months ago
Beckschen / LLaVolta
[NeurIPS 2024] Efficient Large Multi-modal Models via Visual Context Compression
☆66 · Feb 19, 2025 · Updated last year
LanqingL / SCS
"Visual Prompt Selection for In-Context Learning Segmentation Framework"
☆14 · Dec 13, 2024 · Updated last year
UCSB-AI / Discffusion
Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"
☆29 · Apr 27, 2024 · Updated 2 years ago
baaivision / EVE
EVE Series: Encoder-Free Vision-Language Models from BAAI
☆374 · Jul 24, 2025 · Updated 11 months ago
lzhxmu / VTW
Code release for VTW (AAAI 2025 Oral)
☆68 · Nov 4, 2025 · Updated 8 months ago
JieShibo / MemVP
[ICML 2024] Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
☆49 · May 12, 2024 · Updated 2 years ago
MLLM-Data-Contamination / MM-Detect
Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM | EMNLP 2025 Findings
☆18 · Oct 17, 2025 · Updated 9 months ago
jonathan-roberts1 / SciFIBench
NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
☆13 · May 24, 2025 · Updated last year
szzexpoi / POEM
Official Implementation for CVPR 2023 paper "Divide and Conquer: Answering Questions with Object Factorization and Compositional Reasonin…
☆10 · Jun 16, 2024 · Updated 2 years ago
r-three / realistic_evaluation_of_model_merging_for_compositional_generalization
☆12 · Feb 11, 2026 · Updated 5 months ago
HJYao00 / DenseConnector
【NeurIPS 2024】Dense Connector for MLLMs
☆183 · Oct 14, 2024 · Updated last year
ablghtianyi / ICL_Modular_Arithmetic
☆19 · Mar 25, 2025 · Updated last year
hulianyuyy / iLLaVA
iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models (ICLR2026)
☆23 · Jun 24, 2026 · Updated 3 weeks ago
CyberAgentAILab / regularized-bon
Code of "Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment" (2025).
☆14 · Apr 4, 2025 · Updated last year
tianyi-lab / R2-T2
[ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"
☆19 · Mar 10, 2025 · Updated last year
Gumpest / SparseVLMs
[ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference".
☆266 · Dec 22, 2025 · Updated 6 months ago
jmiemirza / Meta-Prompting
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs (ECCV 2024)
☆20 · Jul 15, 2024 · Updated 2 years ago
shiqichen17 / VLM_Merging
GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025)
☆89 · Jun 9, 2026 · Updated last month
apple / ml-reversal-blessing
☆17 · Jul 31, 2025 · Updated 11 months ago
bfshi / scaling_on_scales
When do we not need larger vision models?
☆420 · Feb 8, 2025 · Updated last year
FloatButterfly / LCIC_plus
The extended code of layered conceptual image compression. Journal submitted.
☆15 · Aug 29, 2022 · Updated 3 years ago
SHI-Labs / VisPer-LM
[NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
☆73 · Oct 17, 2025 · Updated 9 months ago
Cooperx521 / PyramidDrop
(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction
☆151 · Mar 6, 2025 · Updated last year
Yuqifan1117 / HalluciDoctor
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data (Accepted by CVPR 2024)
☆52 · Jul 16, 2024 · Updated 2 years ago
SaraGhazanfari / EMMA
EMMA [TMLR 2025]
☆14 · Sep 25, 2025 · Updated 9 months ago
NathanGodey / qfilters
Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)
☆34 · Mar 7, 2025 · Updated last year
kiaia / GIRAFFE
Extending context length of visual language models
☆12 · Dec 18, 2024 · Updated last year
Hon-Wong / ByteVideoLLM
[ICCV 2025] Dynamic-VLM
☆28 · Dec 16, 2024 · Updated last year
thunlp / LLaVA-UHD
LLaVA-UHD v3: Progressive Visual Compression for Efficient Native-Resolution Encoding in MLLMs
☆423 · Jul 6, 2026 · Updated 2 weeks ago