[CVPR2025] Official implementation of the paper "Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices". (by Junyan Lin)
☆48 · Oct 29, 2025 · Updated 6 months ago
Alternatives and similar repositories for Layer_Select_Fuse_for_MLLM
Users that are interested in Layer_Select_Fuse_for_MLLM are comparing it to the libraries listed below.
- [EMNLP 2024 Main] Official implementation of the paper "To Preserve or To Compress: An In-Depth Study of Connector Selection in Multimoda… ☆17 · Dec 13, 2024 · Updated last year
- [EMNLP 2024 Main] Official implementation of the paper "The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Langua… ☆13 · Nov 11, 2024 · Updated last year
- [ICML 2025] Official implementation of the paper "SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling". … ☆22 · Nov 17, 2025 · Updated 6 months ago
- [EMNLP 2024 Main] Official implementation of the paper "Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mech… ☆16 · Oct 8, 2024 · Updated last year
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning". ☆22 · Feb 26, 2025 · Updated last year
- (ACL2025 Findings) Official code for the paper "STeCa: Step-level Trajectory Calibration for LLM Agent Learning" ☆27 · Mar 2, 2026 · Updated 2 months ago
- "Visual Prompt Selection for In-Context Learning Segmentation Framework" ☆15 · Dec 13, 2024 · Updated last year
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV … ☆25 · May 13, 2026 · Updated last week
- [CVPR25] CoLLM: A Large Language Model for Composed Image Retrieval ☆29 · Mar 26, 2025 · Updated last year
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model ☆22 · Jul 20, 2024 · Updated last year
- ☆25 · Dec 26, 2024 · Updated last year
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆327 · May 13, 2026 · Updated last week
- The official repository of the paper 'Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine' ☆127 · Jan 9, 2025 · Updated last year
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer. ☆14 · Mar 2, 2024 · Updated 2 years ago
- Controlled Text Generation Image Dataset ☆28 · Apr 8, 2024 · Updated 2 years ago
- ViDRiP-LLaVA: A Dataset and Benchmark for Diagnostic Reasoning from Pathology Videos ☆23 · May 21, 2025 · Updated last year
- SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification (IEEE TGRS 2023) ☆58 · Mar 13, 2024 · Updated 2 years ago
- PyTorch version of the SVTR model ☆27 · May 24, 2022 · Updated 4 years ago
- [CVPR 2025] COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training ☆41 · Mar 27, 2025 · Updated last year
- ☆33 · Oct 21, 2025 · Updated 7 months ago
- ☆17 · Feb 20, 2025 · Updated last year
- ☆28 · Jul 30, 2024 · Updated last year
- Official implementation of "VIRAL: Visual Representation Alignment for MLLMs". ☆158 · Sep 21, 2025 · Updated 8 months ago
- This repo is the official pytorch implementation of the paper: CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-V… ☆42 · Sep 10, 2025 · Updated 8 months ago
- [CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories ☆103 · Aug 8, 2025 · Updated 9 months ago
- [TPAMI 2023] Object Affinity Learning: Towards Annotation-free Instance Segmentation ☆14 · Sep 14, 2023 · Updated 2 years ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". … ☆63 · Nov 5, 2024 · Updated last year
- Repository of Streaming LLMs ☆63 · May 13, 2026 · Updated last week
- Official code for paper "SPA-RL: Reinforcing LLM Agent via Stepwise Progress Attribution" ☆85 · Sep 13, 2025 · Updated 8 months ago
- Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective ☆15 · Oct 22, 2024 · Updated last year
- Retrieval_OOD_for_Multimodal_AI ☆11 · Dec 4, 2024 · Updated last year
- Tools for replaying Drake simulations in Blender ☆14 · May 12, 2026 · Updated last week
- Data pre-processing and training code on Open-X-Embodiment with pytorch ☆11 · Jan 20, 2025 · Updated last year
- MuJoCo benchmark for Deep Reinforcement Learning as provided by Tianshou framework. ☆15 · Jan 12, 2025 · Updated last year
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024). ☆34 · Mar 5, 2024 · Updated 2 years ago
- [NeurIPS'24] Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation (Diffews) ☆51 · Apr 14, 2025 · Updated last year
- ☆18 · Nov 19, 2024 · Updated last year
- Docker environment for TongjiThesis (Tongji University thesis LaTeX template) ☆12 · Mar 28, 2026 · Updated last month
- ☆29 · Apr 8, 2025 · Updated last year