[CVPR 2025] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
☆16 · Oct 4, 2025 · Updated 4 months ago
Alternatives and similar repositories for LocalizationHeads
Users who are interested in LocalizationHeads are comparing it to the repositories listed below.
- [MICCAI 2024 Spotlight✨] Official Pytorch Code for Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning ☆12 · Sep 4, 2024 · Updated last year
- [WACV 2025 ORAL] Official Pytorch Code for DragText: Rethinking Text Embedding in Point-based Image Editing ☆14 · Jan 22, 2025 · Updated last year
- [ICLR 2025] Official Pytorch Implementation of MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segm… ☆24 · Apr 3, 2025 · Updated 10 months ago
- [CVPR 2025 Highlight] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding ☆61 · Aug 31, 2025 · Updated 6 months ago
- [CVPR 2025] Official Pytorch Code for Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation ☆46 · Mar 27, 2025 · Updated 11 months ago
- Frequency tracking in time-frequency representations ☆13 · Jan 19, 2021 · Updated 5 years ago
- ☆13 · Aug 28, 2024 · Updated last year
- This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers. ☆12 · Nov 7, 2024 · Updated last year
- ☆10 · Oct 13, 2024 · Updated last year
- Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs" ☆92 · Mar 9, 2025 · Updated 11 months ago
- [CVPR 2024 Highlight✨] Official Pytorch Code for EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation ☆92 · Sep 12, 2024 · Updated last year
- Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022) ☆40 · Oct 2, 2022 · Updated 3 years ago
- Rare-to-Frequent (R2F), ICLR'25, Spotlight ☆53 · Apr 23, 2025 · Updated 10 months ago
- [ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs ☆179 · Dec 14, 2025 · Updated 2 months ago
- ☆11 · Aug 7, 2025 · Updated 6 months ago
- Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S… ☆14 · Feb 15, 2023 · Updated 3 years ago
- Official implementation of "CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models" ☆39 · Updated this week
- Time frequency ridge detection based on relevant ridge portions ☆11 · Aug 17, 2023 · Updated 2 years ago
- This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm… ☆12 · Oct 9, 2024 · Updated last year
- ☆14 · Sep 11, 2025 · Updated 5 months ago
- This repository contains the speaker labeled information of VoxCeleb2 and LRS3 audio-visual datasets. (AAAI 2025) ☆13 · Sep 6, 2024 · Updated last year
- Official repository for GraphEQA ☆22 · Sep 25, 2025 · Updated 5 months ago
- ☆11 · Jul 26, 2024 · Updated last year
- [NAACL 2025] Official Code Repository for the paper "Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval" ☆18 · Jul 13, 2025 · Updated 7 months ago
- An exploration of LLM steering ☆24 · Jun 15, 2024 · Updated last year
- ☆14 · Apr 25, 2025 · Updated 10 months ago
- ☆11 · Nov 5, 2025 · Updated 3 months ago
- [MICCAI 2024 Early Acceptance] Official Pytorch Code for Slice-Consistent 3D Volumetric Brain CT-to-MRI Translation with 2D Brownian Brid… ☆57 · Jan 7, 2025 · Updated last year
- [ECCV 2024] Official PyTorch implementation of "Classification Matters: Improving Video Action Detection with Class-Specific Attention" ☆16 · Nov 8, 2024 · Updated last year
- [ICTC'24] - "Voice-Based Age and Gender Recognition: A Comparative Study of LSTM, RezoNet and Hybrid CNNs-BiLSTM Architecture" by Nhut Mi… ☆10 · Jan 16, 2025 · Updated last year
- Java web application backed by the Ethereum-Blockchain network. Powered by RESTful web services (JAX-RS && Spring Boot) , Docker, Kuberne… ☆14 · Feb 19, 2019 · Updated 7 years ago
- Code for "AffordanceLLM: Grounding Affordance from Vision Language Models" ☆14 · Oct 18, 2024 · Updated last year
- [NeurIPS 2025] U-REPA: Aligning Diffusion U-Nets to ViTs ☆33 · Dec 15, 2025 · Updated 2 months ago
- Human age estimation using deep neural networks (Keras) ☆13 · Aug 10, 2023 · Updated 2 years ago
- Official implementation of SBNet as described in "Single-branch Network for Multimodal Training". ☆12 · Aug 28, 2023 · Updated 2 years ago
- Python library for searching lyrics on Musixmatch, Genius and letras.mus.br. ☆10 · Oct 10, 2024 · Updated last year
- Cross-Speaker Encoding Network for Multi-talker Speech Recognition ☆11 · Mar 14, 2025 · Updated 11 months ago
- Reproducible research code for the experiments presented in our article "Kara1k: a karaoke dataset for cover song identification and sing… ☆10 · Jan 9, 2018 · Updated 8 years ago
- Speaker overlap-aware Neural Diarization ☆12 · Feb 13, 2023 · Updated 3 years ago