[CVPR2025] Official implementation of the paper "Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices". (by Junyan Lin)
☆49 · Oct 29, 2025 · Updated 8 months ago
Alternatives and similar repositories for Layer_Select_Fuse_for_MLLM
Users who are interested in Layer_Select_Fuse_for_MLLM are comparing it to the repositories listed below.
- [EMNLP 2024 Main] Official implementation of the paper "To Preserve or To Compress: An In-Depth Study of Connector Selection in Multimoda… ☆17 · Dec 13, 2024 · Updated last year
- ☆14 · Nov 19, 2024 · Updated last year
- [ICML 2025] Official implementation of the paper "SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling". … ☆22 · Nov 17, 2025 · Updated 7 months ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆22 · Feb 26, 2025 · Updated last year
- (ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning" ☆29 · Mar 2, 2026 · Updated 4 months ago
- "Visual Prompt Selection for In-Context Learning Segmentation Framework" ☆14 · Dec 13, 2024 · Updated last year
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV … ☆27 · May 13, 2026 · Updated last month
- [ECCV 2024] Official Release of SILC: Improving vision language pretraining with self-distillation ☆48 · Oct 3, 2024 · Updated last year
- [CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval ☆28 · Mar 26, 2025 · Updated last year
- RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering ☆10 · Nov 27, 2022 · Updated 3 years ago
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ☆21 · Jul 20, 2024 · Updated last year
- ☆25 · Dec 26, 2024 · Updated last year
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆356 · Jun 20, 2026 · Updated 2 weeks ago
- Fine-Grained Knowledge Fusion for Retrieval-Augmented Medical Visual Question ☆11 · Jul 18, 2024 · Updated last year
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Mar 2, 2024 · Updated 2 years ago
- ViDRiP-LLaVA: A Dataset and Benchmark for Diagnostic Reasoning from Pathology Videos ☆24 · May 21, 2025 · Updated last year
- [CVPR 2025] COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training ☆41 · Mar 27, 2025 · Updated last year
- ☆17 · Feb 20, 2025 · Updated last year
- ☆29 · Jul 30, 2024 · Updated last year
- ☆29 · Dec 5, 2021 · Updated 4 years ago
- Official implementation of "VIRAL: Visual Representation Alignment for MLLMs". ☆162 · Sep 21, 2025 · Updated 9 months ago
- This repo is the official pytorch implementation of the paper: CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-V… ☆42 · Sep 10, 2025 · Updated 9 months ago
- [CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories ☆106 · Aug 8, 2025 · Updated 10 months ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". … ☆63 · Nov 5, 2024 · Updated last year
- ☆26 · Jun 5, 2025 · Updated last year
- [BMVC 2024 Oral ✨] Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization ☆20 · Sep 11, 2024 · Updated last year
- Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective ☆15 · Oct 22, 2024 · Updated last year
- Retrieval_OOD_for_Multimodal_AI ☆11 · Dec 4, 2024 · Updated last year
- Official Implementation for paper "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm" ☆23 · May 8, 2026 · Updated last month
- ☆13 · Jun 10, 2025 · Updated last year
- Tools for replaying Drake simulations in Blender ☆14 · May 12, 2026 · Updated last month
- Data pre-processing and training code on Open-X-Embodiment with pytorch ☆11 · Jan 20, 2025 · Updated last year
- MuJoCo benchmark for Deep Reinforcement Learning as provided by Tianshou framework. ☆15 · Jan 12, 2025 · Updated last year
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024). ☆34 · Mar 5, 2024 · Updated 2 years ago
- [NeurIPS'24] Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation (Diffews) ☆52 · Apr 14, 2025 · Updated last year
- ☆18 · Nov 19, 2024 · Updated last year
- [ICCV2025] Harnessing CLIP, DINO and SAM for Open Vocabulary Segmentation ☆122 · Nov 22, 2025 · Updated 7 months ago
- ☆116 · Oct 21, 2025 · Updated 8 months ago
- Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025 ☆31 · Apr 8, 2025 · Updated last year