Cooperx521/PyramidDrop

(CVPR 2025) PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

☆151

Alternatives and similar repositories for PyramidDrop

Users who are interested in PyramidDrop are comparing it to the repositories listed below.

Cooperx521 / ScaleCap
(ICLR 2026) Official repository of 'ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing'
☆60 · Jan 26, 2026 · Updated 5 months ago
pkunlp-icler / FastV
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Langua…
☆592 · Jan 4, 2025 · Updated last year
hasanar1f / HiRED
[AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio…
☆58 · Apr 18, 2025 · Updated last year
Theia-4869 / FasterVLM
Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster.
☆114 · Jun 29, 2025 · Updated last year
liuting20 / MustDrop
Multi-Stage Vision Token Dropping: Towards Efficient Multimodal Large Language Model
☆36 · Jan 8, 2025 · Updated last year
InternLM / Spark
An official implementation of "SPARK: Synergistic Policy And Reward Co-Evolving Framework"
☆25 · Oct 23, 2025 · Updated 8 months ago
Liuziyu77 / MIA-DPO
Official implementation of MIA-DPO
☆69 · Jan 23, 2025 · Updated last year
shikiw / Modality-Integration-Rate
[ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R…
☆113 · Jul 9, 2025 · Updated last year
SUSTechBruce / LOOK-M
[EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…
☆103 · Nov 9, 2024 · Updated last year
Gumpest / SparseVLMs
[ICML'25] Official implementation of paper "SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference".
☆266 · Dec 22, 2025 · Updated 7 months ago
ywh187 / FitPrune
☆68 · Jan 23, 2026 · Updated 5 months ago
beichenzbc / BoostStep
Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"
☆37 · Jan 21, 2025 · Updated last year
ZichenWen1 / DART
[EMNLP 2025 main 🔥] Code for "Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More"
☆121 · Oct 12, 2025 · Updated 9 months ago
OpenIXCLab / CODA
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
☆37 · Aug 28, 2025 · Updated 10 months ago
KD-TAO / DyCoke
[CVPR 2025] DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models
☆113 · Nov 22, 2025 · Updated 8 months ago
daixiangzi / Awesome-Token-Compress
A paper list of some recent works about token compression for ViT and VLM
☆939 · Updated this week
orailix / PACT
[CVPR 2025] PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models
☆60 · Jan 30, 2026 · Updated 5 months ago
InternLM / EndoCoT
[ECCV 2026] An official implementation of "EndoCoT". Scaling endogenous Chain-of-Thought (CoT) reasoning in diffusion models for complex …
☆43 · Jun 26, 2026 · Updated 3 weeks ago
TerryPei / CSP
Cross-Self KV Cache Pruning for Efficient Vision-Language Inference
☆10 · Dec 15, 2024 · Updated last year
bcmi / Granular-GRPO
[CVPR 2026] Fine-Grained GRPO for Precise Preference Alignment in Flow Models
☆64 · Jun 1, 2026 · Updated last month
InternLM / ARC-VL
[CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC"
☆46 · Nov 26, 2025 · Updated 7 months ago
lzhxmu / VTW
Code release for VTW (AAAI 2025 Oral)
☆68 · Nov 4, 2025 · Updated 8 months ago
Wiselnn570 / VideoRoPE
[ICML 2025 Oral] An official implementation of VideoRoPE & VideoRoPE++
☆223 · Apr 15, 2026 · Updated 3 months ago
InternLM / CapRL
[ICLR 2026] An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"
☆225 · Jun 23, 2026 · Updated 3 weeks ago
mrwu-mac / ControlMLLM
[NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"
☆210 · Jul 17, 2025 · Updated last year
42Shawn / LLaVA-PruMerge
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
☆173 · Mar 8, 2026 · Updated 4 months ago
InternLM / ETCHR
A question-conditioned, reasoning-aware image editor designed to serve as a decoupled visual reasoning assistant for Multimodal Large Lan…
☆23 · May 25, 2026 · Updated last month
Visual-AI / PruneVid
[ACL 2025] PruneVid: Visual Token Pruning for Efficient Video Large Language Models
☆72 · May 15, 2025 · Updated last year
xiaoachen98 / Open-LLaVA-NeXT
An open-source implementation for training LLaVA-NeXT.
☆439 · Oct 23, 2024 · Updated last year
Yxxxb / VoCo-LLaMA
[CVPR'2025] VoCo-LLaMA: This repo is the official implementation of "VoCo-LLaMA: Towards Vision Compression with Large Language Models".
☆205 · Jun 18, 2025 · Updated last year
OpenEvaluation / VLMEvalKit
☆23 · Apr 11, 2026 · Updated 3 months ago
Mark12Ding / Dispider
[CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
☆180 · Mar 23, 2025 · Updated last year
G-JWLee / TAMP
☆12 · May 15, 2025 · Updated last year
Bujiazi / ByTheWay
[CVPR 2025] Official implementation of ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
☆48 · Oct 10, 2025 · Updated 9 months ago
JIA-Lab-research / VisionZip
Official repository for VisionZip (CVPR 2025)
☆443 · Jul 21, 2025 · Updated last year
shikiw / Awesome-MLLM-Hallucination
Papers about Hallucination in Multi-Modal Large Language Models (MLLMs)
☆103 · Nov 21, 2024 · Updated last year
hulianyuyy / iLLaVA
iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models (ICLR2026)
☆23 · Jun 24, 2026 · Updated 3 weeks ago
vbdi / divprune
[CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models
☆86 · Apr 16, 2026 · Updated 3 months ago
zhangbaijin / From-Redundancy-to-Relevance
[NAACL 2025 Oral] From redundancy to relevance: Enhancing explainability in multimodal large language models
☆130 · Jan 30, 2026 · Updated 5 months ago