Text Proxy: Decomposing Retrieval from a 1-to-N Relationship into N 1-to-1 Relationships for Text-Video Retrieval -- AAAI2025
☆17 · Jul 14, 2025 · Updated 8 months ago
Alternatives and similar repositories for Text-Proxy
Users that are interested in Text-Proxy are comparing it to the libraries listed below.
- Code implementation of paper "MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval (AAAI2025)" ☆25 · Feb 2, 2025 · Updated last year
- [CVPR 2024] TeachCLIP for Text-to-Video Retrieval ☆42 · May 7, 2025 · Updated 10 months ago
- [NeurIPS 2023] The official implementation of paper "Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval" acce… ☆27 · May 14, 2024 · Updated last year
- [ICCV'23] UATVR: Uncertainty-Adaptive Text-Video Retrieval ☆13 · Nov 5, 2023 · Updated 2 years ago
- https://layer6ai-labs.github.io/xpool/ ☆134 · Jul 1, 2023 · Updated 2 years ago
- ICCV'23 Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval ☆19 · Aug 22, 2025 · Updated 7 months ago
- This is a sample of recommender system based on keywords from local top-ranking news and provides candidate visiting routes. The default … ☆18 · Aug 14, 2021 · Updated 4 years ago
- ☆20 · Jul 28, 2025 · Updated 7 months ago
- ☆10 · Nov 27, 2024 · Updated last year
- [NeurIPS 2022 Spotlight] Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations ☆147 · Apr 9, 2024 · Updated last year
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023) ☆66 · Jun 7, 2024 · Updated last year
- Source code of our AAAI 2024 paper "Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval" ☆55 · Mar 28, 2024 · Updated last year
- The official implementation of "Enhancing Representation in Radiography-Reports Foundation Model: A Granular Alignment Algorithm Using Ma… ☆13 · Sep 13, 2024 · Updated last year
- [ICLR 2025] TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval ☆26 · Feb 13, 2025 · Updated last year
- ☆12 · Dec 15, 2023 · Updated 2 years ago
- [CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval ☆22 · Jun 23, 2025 · Updated 9 months ago
- Official implementation of "Meta-Entity Driven Triplet Mining for Aligning Medical Vision-Language Models" ☆14 · Mar 19, 2025 · Updated last year
- ☆36 · Mar 28, 2024 · Updated last year
- [ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval ☆41 · Apr 11, 2025 · Updated 11 months ago
- Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval --ICCV2023 Oral ☆91 · Nov 2, 2023 · Updated 2 years ago
- Source code of our MM'22 paper Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning ☆21 · Jun 20, 2024 · Updated last year
- ☆13 · Aug 14, 2022 · Updated 3 years ago
- Personal blog built with Hugo+gitpages | Available at https://ephmeral.github.io ☆10 · Dec 30, 2022 · Updated 3 years ago
- Pytorch Implementation of LoG 22 [Oral] -- Transductive Linear Probing: A Novel Framework for Few-Shot Node Classification ☆17 · May 31, 2023 · Updated 2 years ago
- Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_fea… ☆13 · Jan 30, 2020 · Updated 6 years ago
- Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024 ☆58 · Aug 19, 2025 · Updated 7 months ago
- ☆29 · Jun 9, 2025 · Updated 9 months ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight) ☆21 · Aug 1, 2025 · Updated 7 months ago
- Universal Video Temporal Grounding with Generative Multi-modal Large Language Models ☆47 · Nov 25, 2025 · Updated 4 months ago
- Official Pytorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024) ☆60 · May 26, 2024 · Updated last year
- The official code of paper "Multi-to-Single: Reducing Multimodal Dependency in Emotion Recognition through Contrastive Learning" (AAAI 20… ☆30 · Sep 30, 2025 · Updated 5 months ago
- Source code of our MM'22 paper Partially Relevant Video Retrieval ☆55 · Nov 4, 2024 · Updated last year
- ☆10 · Mar 31, 2025 · Updated 11 months ago
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023) ☆55 · Sep 7, 2023 · Updated 2 years ago
- [EMNLP’24 Main] Encoding and Controlling Global Semantics for Long-form Video Question Answering ☆18 · Oct 9, 2024 · Updated last year
- Repository of "Improving Cross-Modal Retrieval With Set of Diverse Embeddings" (CVPR'23, Highlight) ☆41 · Nov 15, 2023 · Updated 2 years ago
- QWEN 2.5VL-R1: Multimodal reasoning model for action recognition in videos (Experimental GRPO with LoRA support) ☆23 · Oct 9, 2025 · Updated 5 months ago
- A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CV… ☆83 · Jan 20, 2026 · Updated 2 months ago
- ☆19 · Mar 5, 2025 · Updated last year