MICV-yonsei/LocalizationHeads

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/MICV-yonsei/LocalizationHeads)

MICV-yonsei / LocalizationHeads

[CVPR 2025] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding

☆16

Alternatives and similar repositories for LocalizationHeads

Users that are interested in LocalizationHeads are comparing it to the libraries listed below

Sorting:

MICV-yonsei / CXRL
View on GitHub
[MICCAI 2024 Spotlight✨] Official Pytorch Code for Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning
☆12Sep 4, 2024Updated last year
MICV-yonsei / DragText
View on GitHub
[WACV 2025 ORAL] Official Pytorch Code for DragText: Rethinking Text Embedding in Point-based Image Editing
☆14Jan 22, 2025Updated last year
jdg900 / MMR
View on GitHub
[ICLR 2025] Official Pytorch Implementation of MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segm…
☆24Apr 3, 2025Updated 10 months ago
seilk / LocalizationHeads
View on GitHub
[CVPR 2025 Highlight] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding
☆61Aug 31, 2025Updated 6 months ago
MICV-yonsei / CASS
View on GitHub
[CVPR 2025] Official Pytorch Code for Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
☆46Mar 27, 2025Updated 11 months ago
DavidBondesson / Frequency_ridge_tracking
View on GitHub
Frequency tracking in time-frequency representations
☆13Jan 19, 2021Updated 5 years ago
Lilidamowang / T2VIndexer-generativeSearch
View on GitHub
☆13Aug 28, 2024Updated last year
saurjya / EnsembleSep
View on GitHub
This branch of Asteroid contains code for the vocal harmony and chamber ensemble separation related papers.
☆12Nov 7, 2024Updated last year
Run542968 / GAP
View on GitHub
☆10Oct 13, 2024Updated last year
sudo-Boris / mr-Blip
View on GitHub
Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs"
☆92Mar 9, 2025Updated 11 months ago
MICV-yonsei / EAGLE
View on GitHub
[CVPR 2024 Highlight✨] Official Pytorch Code for EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation
☆92Sep 12, 2024Updated last year
stoneMo / EZ-VSL
View on GitHub
Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)
☆40Oct 2, 2022Updated 3 years ago
krafton-ai / Rare-to-Frequent
View on GitHub
Rare-to-Frequent (R2F), ICLR'25, Spotlight
☆53Apr 23, 2025Updated 10 months ago
xmed-lab / TAM
View on GitHub
[ICCV25 Oral] Token Activation Map to Visually Explain Multimodal LLMs
☆179Dec 14, 2025Updated 2 months ago
Vinoground / Vinoground
View on GitHub
☆11Aug 7, 2025Updated 6 months ago
etzinis / optimal_condition_training
View on GitHub
Code and data recipes for the paper: Optimal Condition Training for Target Source Separation by Efthymios Tzinis, Gordon Wichern, Paris S…
☆14Feb 15, 2023Updated 3 years ago
cvlab-kaist / CAMEO
View on GitHub
Official implementation of "CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models"
☆39Updated this week
Nils-Laurent / RRP-RD
View on GitHub
Time frequency ridge detection based on relevant ridge portions
☆11Aug 17, 2023Updated 2 years ago
MIS-DevWorks / FBR
View on GitHub
This repository contains the official code for "Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention, Alignm…
☆12Oct 9, 2024Updated last year
hyungjin-chung / VPS
View on GitHub
☆14Sep 11, 2025Updated 5 months ago
JeongHun0716 / VoxLRS-SA
View on GitHub
This repository contains the speaker labeled information of VoxCeleb2 and LRS3 audio-visual datasets. (AAAI 2025)
☆13Sep 6, 2024Updated last year
SaumyaSaxena / graph_eqa
View on GitHub
Official repository for GraphEQA
☆22Sep 25, 2025Updated 5 months ago
mobiushy / move-act
View on GitHub
☆11Jul 26, 2024Updated last year
baekingeol / Probing-RAG
View on GitHub
[NAACL 2025] Official Code Repository for the paper "Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval"
☆18Jul 13, 2025Updated 7 months ago
cma1114 / activation_steering
View on GitHub
An exploration of LLM steering
☆24Jun 15, 2024Updated last year
NVlabs / FRAG
View on GitHub
☆14Apr 25, 2025Updated 10 months ago
mavceleb / mavceleb_baseline
View on GitHub
☆11Nov 5, 2025Updated 3 months ago
MICV-yonsei / CT2MRI
View on GitHub
[MICCAI 2024 Early Acceptance] Official Pytorch Code for Slice-Consistent 3D Volumetric Brain CT-to-MRI Translation with 2D Brownian Brid…
☆57Jan 7, 2025Updated last year
naver-ai / class-query-vad
View on GitHub
[ECCV 2024] Official PyTorch implementation of "Classification Matters: Improving Video Action Detection with Class-Specific Attention"
☆16Nov 8, 2024Updated last year
nhut-ngnn / Voice-Based-Age-and-Gender-Recogniton
View on GitHub
[ICTC'24] - "Voice-Based Age and Gender Recognition: A Comparative Study of LSTM, RezoNet and Hybrid CNNs-BiLSTM Architecture" by Nhut Mi…
☆10Jan 16, 2025Updated last year
NicolaNardino / Blockchain.Ethereum
View on GitHub
Java web application backed by the Ethereum-Blockchain network. Powered by RESTful web services (JAX-RS && Spring Boot) , Docker, Kuberne…
☆14Feb 19, 2019Updated 7 years ago
JasonQSY / AffordanceLLM
View on GitHub
Code for "AffordanceLLM: Grounding Affordance from Vision Language Models"
☆14Oct 18, 2024Updated last year
YuchuanTian / U-REPA
View on GitHub
[NeurIPS 2025] U-REPA: Aligning Diffusion U-Nets to ViTs
☆33Dec 15, 2025Updated 2 months ago
lucaArrotta / Age-Estimation-based-on-Human-Voice
View on GitHub
Human age estimation using deep neural networks (Keras)
☆13Aug 10, 2023Updated 2 years ago
msaadsaeed / SBNet
View on GitHub
Official implementation of SBNet as described in "Single-branch Network for Multimodal Training".
☆12Aug 28, 2023Updated 2 years ago
AmanoTeam / LyricsPy
View on GitHub
Python library for searching lyrics on Musixmatch, Genius and letras.mus.br.
☆10Oct 10, 2024Updated last year
kjw11 / CSEnet-ASR
View on GitHub
Cross-Speaker Encoding Network for Multi-talker Speech Recognition
☆11Mar 14, 2025Updated 11 months ago
ybayle / ISM2017
View on GitHub
Reproducible research code for the experiments presented in our article "Kara1k: a karaoke dataset for cover song identification and sing…
☆10Jan 9, 2018Updated 8 years ago
ZhihaoDU / du2022sond
View on GitHub
Speaker overlap-aware Neural Diarization
☆12Feb 13, 2023Updated 3 years ago