zhangbw17/MV-Adapter

An official PyTorch implementation of the paper [MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval].

☆14

Alternatives and similar repositories for MV-Adapter

Users who are interested in MV-Adapter are comparing it to the repositories listed below.

reddyav1 / RoCoG-v2
RoCoG-v2 (Robot Control Gestures) is a dataset intended to support the study of synthetic-to-real and ground-to-air video domain adaptati…
☆17 · Mar 28, 2024 · Updated 2 years ago
ZhangWeihang99 / HVSA
Official PyTorch implementation for Hypersphere-Based Remote Sensing Cross-Modal Text–Image Retrieval via Curriculum Learning.
☆16 · Aug 10, 2024 · Updated last year
LgQu / LeaPRR
Learnable Pillar-based Re-ranking for Image-Text Retrieval. SIGIR'23
☆22 · Jul 31, 2023 · Updated 2 years ago
MattWallingford / TAPS
PyTorch Implementation of Task Adaptive Parameter Sharing for Multi-Task Learning (CVPR 2022)
☆27 · Jul 11, 2023 · Updated 3 years ago
TangXu-Group / Cross-modal-remote-sensing-image-and-text-retrieval-models
☆22 · Sep 19, 2024 · Updated last year
ZhanYang-nwpu / PE-RSITR
Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval, 2023
☆29 · Jan 14, 2024 · Updated 2 years ago
seekerhuang / HarMA
[ICLRW 2024] Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment
☆64 · Jul 18, 2024 · Updated 2 years ago
kennymckormick / ARAS-Dataset
☆11 · Nov 5, 2024 · Updated last year
nchucvml / STVT
Video Summarization With Spatiotemporal Vision Transformer
☆23 · Jul 5, 2023 · Updated 3 years ago
jicheol93 / PLOT
☆13 · Feb 13, 2025 · Updated last year
sqiangcao99 / E2E-LOAD
☆21 · Jul 26, 2023 · Updated 2 years ago
QinYang79 / DECL
Deep Evidential Learning with Noisy Correspondence for Cross-modal Retrieval (ACM Multimedia 2022, PyTorch Code)
☆49 · Mar 21, 2024 · Updated 2 years ago
jdzurik / ElasticSearch-WindowsInstaller
Elasticsearch Windows Service Installer
☆24 · Dec 16, 2024 · Updated last year
Collebt / EM-CVGL
Code of Learning Cross-view Visual Geo-localization without Ground Truth
☆12 · Feb 17, 2025 · Updated last year
MediaBrain-SJTU / GSC
☆14 · Jul 13, 2024 · Updated 2 years ago
VisualAIKHU / Keyword-DETR
Official Repository for "Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection" (AAAI …
☆15 · Mar 1, 2025 · Updated last year
shuanglinyan / CFine
CLIP-Driven Fine-grained Text-Image Person Re-identification
☆66 · Nov 22, 2023 · Updated 2 years ago
CrossmodalGroup / ESL
☆12 · May 3, 2024 · Updated 2 years ago
taewhankim / VIPCAP
☆15 · Dec 31, 2024 · Updated last year
ByZ0e / Glance-Focus
This repo contains source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (Accepted in NeurIPS 2023)
☆31 · Jun 28, 2024 · Updated 2 years ago
stogiannidis / srbench
Source code for the Paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models"
☆19 · Feb 1, 2026 · Updated 5 months ago
QinYang79 / CRCL
Cross-modal Active Complementary Learning with Self-refining Correspondence (NeurIPS 2023, PyTorch Code)
☆15 · Jun 6, 2024 · Updated 2 years ago
dibschat / ProVideLLM
[ICCV 2025] Streaming VideoLLMs for Real-time Procedural Video Understanding
☆18 · Oct 26, 2025 · Updated 9 months ago
THUKElab / MESED
[AAAI 2024] MESED: A Multi-modal Entity Set Expansion Dataset with Fine-grained Semantic Classes and Hard Negative Entities
☆15 · Apr 26, 2024 · Updated 2 years ago
lbaermann / qaego4d
Code and Dataset for the CVPRW Paper "Where did I leave my keys? — Episodic-Memory-Based Question Answering on Egocentric Videos"
☆31 · Aug 28, 2023 · Updated 2 years ago
Noah888 / DAR
Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
☆15 · Oct 22, 2024 · Updated last year
hhc1997 / MSCN
☆12 · Mar 28, 2024 · Updated 2 years ago
bladewaltz1 / PromptSwitch
☆30 · Aug 14, 2023 · Updated 2 years ago
iLearn-Lab / SIGIR24-DQU-CIR
[SIGIR 2024] - Simple but Effective Raw-Data Level Multimodal Fusion for Composed Image Retrieval
☆44 · Jul 14, 2024 · Updated 2 years ago
Delong-liu-bupt / SEN
[Neural Networks 2025] Text-guided Image Restoration and Semantic Enhancement for Text-to-Image Person Retrieval
☆12 · Dec 24, 2024 · Updated last year
HYUNJS / STOV-TAL
[WACV-2025] Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization
☆17 · May 28, 2025 · Updated last year
Zjut-MultimediaPlus / PIR-pytorch
A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval (MM'23 Oral)
☆15 · Dec 8, 2023 · Updated 2 years ago
Jahawn-Wen / CAMeL-reID
[IEEE Transactions on Information Forensics and Security'25] PyTorch implementation of CAMeL: Cross-modality Adaptive Meta-Learning for T…
☆17 · Jan 5, 2026 · Updated 6 months ago
xianming-gu / AdaFuse
The official code of 'AdaFuse: Adaptive Medical Image Fusion Based on Spatial-Frequential Cross Attention'.
☆12 · Dec 11, 2024 · Updated last year
eastbrother87 / FAPM_official
This repository contains the implementation of FAPM (2023 ICASSP).
☆25 · Jun 19, 2023 · Updated 3 years ago
Halluminate / westworld
☆19 · Mar 7, 2026 · Updated 4 months ago
josephzpng / DisTime
DisTime: Distribution-based Time Representation for Video Large Language Models.
☆21 · Jul 10, 2025 · Updated last year
cvsp-lab / AgilePruner
[ICLR 2026] AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models
☆28 · Mar 3, 2026 · Updated 4 months ago
davidsvy / hard-negative-mixing
An unofficial PyTorch implementation of the NeurIPS 2020 paper Hard Negative Mixing for Contrastive Learning.
☆20 · Oct 17, 2022 · Updated 3 years ago