[CVPR2025] Official implementation of the paper "Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices". (by Junyan Lin)
☆45Oct 29, 2025Updated 4 months ago
Alternatives and similar repositories for Layer_Select_Fuse_for_MLLM
Users who are interested in Layer_Select_Fuse_for_MLLM are comparing it to the libraries listed below.
- [EMNLP 2024 Main] Official implementation of the paper "To Preserve or To Compress: An In-Depth Study of Connector Selection in Multimoda…☆17Dec 13, 2024Updated last year
- [EMNLP 2024 Main] Official implementation of the paper "The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Langua…☆13Nov 11, 2024Updated last year
- ☆14Nov 19, 2024Updated last year
- [EMNLP 2024 Main] Official implementation of the paper "Unveiling In-Context Learning: A Coordinate System to Understand Its Working Mech…☆16Oct 8, 2024Updated last year
- Official PyTorch codebase for the paper "Modeling Caption Diversity in Contrastive Vision-Language Pretraining".☆18Mar 28, 2025Updated 11 months ago
- [ACL 2025 Findings] Official implementation of the paper "Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning".☆20Feb 26, 2025Updated last year
- "Visual Prompt Selection for In-Context Learning Segmentation Framework"☆15Dec 13, 2024Updated last year
- Matlab implementation of our TMM 2020 paper "Pixel-level Non-local Image Smoothing with Objective Evaluation"☆10Nov 24, 2020Updated 5 years ago
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV …☆25Dec 4, 2025Updated 3 months ago
- [ECCV 2024] Official Release of SILC: Improving vision language pretraining with self-distillation☆47Oct 3, 2024Updated last year
- [ECCV24] VISA: Reasoning Video Object Segmentation via Large Language Model☆19Jul 20, 2024Updated last year
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.☆303Updated this week
- The official repository of the paper 'Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine'☆122Jan 9, 2025Updated last year
- [ECCV'24] Official Implementation of Autoregressive Visual Entity Recognizer.☆14Mar 2, 2024Updated 2 years ago
- Official Code for the ICCV23 Paper: "LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval…☆40Oct 14, 2023Updated 2 years ago
- [CVPR 2025] COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training☆39Mar 27, 2025Updated 11 months ago
- ☆17Feb 20, 2025Updated last year
- This repo is the official pytorch implementation of the paper: CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-V…☆40Sep 10, 2025Updated 6 months ago
- [TPAMI 2023] Object Affinity Learning: Towards Annotation-free Instance Segmentation☆14Sep 14, 2023Updated 2 years ago
- ☆23Jun 5, 2025Updated 9 months ago
- Retrieval_OOD_for_Multimodal_AI☆11Dec 4, 2024Updated last year
- Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective☆14Oct 22, 2024Updated last year
- Official Implementation for paper "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm"☆21Mar 18, 2026Updated last week
- MuJoCo benchmark for Deep Reinforcement Learning as provided by Tianshou framework.☆15Jan 12, 2025Updated last year
- [AAAI 2026 Oral] SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation☆37Nov 24, 2025Updated 4 months ago
- This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).☆34Mar 5, 2024Updated 2 years ago
- ☆114Oct 21, 2025Updated 5 months ago
- [NeurIPS'24] Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation (Diffews)☆49Apr 14, 2025Updated 11 months ago
- ☆18Nov 19, 2024Updated last year
- ☆28Apr 8, 2025Updated 11 months ago
- Spectral Graph Attention Network with Fast Eigen-approximation☆12Dec 24, 2021Updated 4 years ago
- Hyperbolic Safety-Aware Vision-Language Models. CVPR 2025☆30Apr 8, 2025Updated 11 months ago
- ☆21Mar 6, 2026Updated 2 weeks ago
- [ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models"☆54Feb 10, 2025Updated last year
- Recent Advances in Visual Dialog☆30Aug 19, 2022Updated 3 years ago
- Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation (NeurIPS 23)☆12May 7, 2025Updated 10 months ago
- [NeurIPS 2024] One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos☆146Dec 26, 2024Updated last year
- Video-based AI dialogue system☆11Aug 16, 2023Updated 2 years ago
- Official code for Guiding Language Model Math Reasoning with Planning Tokens☆19Feb 29, 2024Updated 2 years ago