zzezze / NeighborRetr
Official implementation of "NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval (CVPR 2025)"
☆27Updated 5 months ago
Alternatives and similar repositories for NeighborRetr
Users who are interested in NeighborRetr are comparing it to the libraries listed below
- [CVPR2025] Synthetic Data is an Elegant GIFT for Continual Vision-Language Models☆18Updated 3 months ago
- [SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval☆43Updated last year
- Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval [CVPR 2025 Highlight]☆60Updated 2 months ago
- Code for the Fine-grained Textual Inversion network for Zero-Shot Composed Image Retrieval☆25Updated 6 months ago
- PyTorch Implementation of LLaVA-ReID: Selective Multi-image Questioner for Interactive Person Re-Identification☆81Updated 2 weeks ago
- Collection of Composed Image Retrieval (CIR) papers.☆266Updated last month
- [CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding☆108Updated last month
- [ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos☆81Updated 3 weeks ago
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV…☆57Updated last month
- Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024☆19Updated last year
- Survey: https://arxiv.org/pdf/2507.20198☆150Updated 3 weeks ago
- Code for the paper "Compositional Entailment Learning for Hyperbolic Vision-Language Models".☆83Updated 3 months ago
- ☆33Updated 2 months ago
- 🔥CVPR 2025 Multimodal Large Language Models Paper List☆155Updated 6 months ago
- [ICLR 2025] Official Implementation of Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection☆47Updated 2 months ago
- [NeurIPS 2025] Official code for paper: Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs.☆60Updated 2 weeks ago
- [CVPR 2025 Highlight] Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding☆37Updated last month
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆192Updated 2 months ago
- [✨Official Code of TSPO] Temporal Sampling Policy Optimization for Long-form Video Language Understanding☆50Updated last month
- [ICCV 2025] The official PyTorch implementation of "LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs".☆17Updated 3 months ago
- ☆23Updated 2 years ago
- [CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models☆46Updated 4 months ago
- Noise of Web (NoW) is a challenging noisy correspondence learning (NCL) benchmark containing 100K image-text pairs for robust image-text …☆15Updated last month
- [ICLR'25] Official code for the paper 'MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs'☆274Updated 5 months ago
- Official Implementation of GENIUS: A Generative Framework for Universal Multimodal Search, CVPR 2025☆31Updated last month
- Official Implementation of Visual Abstraction: A Plug-and-Play Approach for Text-Visual Retrieval☆22Updated 2 months ago
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models.☆301Updated last week
- [CVPR' 25] Interleaved-Modal Chain-of-Thought☆87Updated last month
- This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our em…☆71Updated 2 months ago
- A Fine-grained Benchmark for Video Captioning and Retrieval☆21Updated 2 months ago