xxayt/MGSV

[ICCV 2025] This repo is the official implementation of "Music Grounding by Short Video"

☆27

Alternatives and similar repositories for MGSV

Users who are interested in MGSV are comparing it to the repositories listed below.

adxcreative / D-M
The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin…
☆10 · Feb 9, 2025 · Updated last year
jyliu-98 / MoSketch
[ICCV 2025] This repo is the official implementation of "Multi-Object Sketch Animation by Scene Decomposition and Motion Planning"
☆28 · Jul 30, 2025 · Updated 11 months ago
patrick-0817 / T-MASS-dataleakage
☆10 · Nov 27, 2024 · Updated last year
kevinliang888 / IVR-QA-baselines
[ICCV 2023] Simple Baselines for Interactive Video Retrieval with Questions and Answers
☆20 · Apr 16, 2024 · Updated 2 years ago
ruc-aimc-lab / TeachCLIP
[CVPR 2024] TeachCLIP for Text-to-Video Retrieval
☆42 · May 7, 2025 · Updated last year
EIT-NLP / HiDrop
☆17 · Apr 5, 2026 · Updated 3 months ago
HKUST-LongGroup / DyME
[ICLR 2026] Empowering Small VLMs to Think with Dynamic Memorization and Exploration
☆18 · Mar 18, 2026 · Updated 4 months ago
HKUST-LongGroup / GIR-Bench
[ICLR 2026] GIR-Bench: Versatile Benchmark for Generating Images with Reasoning
☆34 · Jan 27, 2026 · Updated 5 months ago
adxcreative / COPE
☆15 · Dec 20, 2024 · Updated last year
banjiuyufen / ArXiv-Agent
🕵️ ArXiv Agent v1.0 - Your Intelligent Research Assistant
☆27 · Dec 29, 2025 · Updated 6 months ago
WeiQijie / retinal-lesions
☆31 · Oct 11, 2020 · Updated 5 years ago
uvavision / DrillDown
[NeurIPS 2019] Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries
☆12 · Apr 15, 2022 · Updated 4 years ago
VisualAIKHU / Missing-AVQA
Official Repository for "Learning Trimodal Relation for Audio-Visual Question Answering with Missing Modality" (ECCV 2024)
☆16 · Oct 29, 2024 · Updated last year
ispamm / FolAI
Stable-V2A: Synthesis of Synchronized Sound Effect with Temporal and Semantic Controls
☆18 · May 27, 2025 · Updated last year
neoglez / calvis
calvis: Chest, wAist and peLVIS circumference from 3D human Body meshes for Deep Learning.
☆14 · Jul 16, 2026 · Updated last week
rhfeiyang / VideoSketcher
Official implementation of "VideoSketcher: Video Models Prior Enable Versatile Sequential Sketch Generation"
☆15 · Apr 7, 2026 · Updated 3 months ago
jnwnlee / video-foley
Official implementation of "Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound". IEEE TASLP 20…
☆19 · Feb 27, 2026 · Updated 4 months ago
maifoundations / Streamo
Streaming Video Instruction Tuning
☆79 · Feb 25, 2026 · Updated 4 months ago
zzhbrr / CMU15445-2022-notes
My notes for cmu15445 2022
☆14 · Feb 8, 2023 · Updated 3 years ago
zinuoli / TriSense
[NeurIPS 2025] Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM
☆27 · Feb 10, 2026 · Updated 5 months ago
DreamerCCC / CutFreq
☆12 · Feb 22, 2024 · Updated 2 years ago
jasongief / OV-AVEL
[2025 CVPR] Towards Open-Vocabulary Audio-Visual Event Localization
☆46 · Mar 7, 2025 · Updated last year
Wuuu3511 / LAMVSNET
Boosting Multi-view Stereo with Late Cost Aggregation
☆13 · Jan 24, 2024 · Updated 2 years ago
thechargedneutron / ExpertAF
Code implementation of the paper 'ExpertAF: Expert Actionable Feedback from Video'
☆17 · Sep 30, 2025 · Updated 9 months ago
tuyunbin / Review-of-Change-Captioning
This repository offers a comprehensive overview of existing datasets and methods in the field of change captioning.
☆17 · Sep 2, 2025 · Updated 10 months ago
zaiquanyang / LLaVA_Next_STVG
LLaVA-Next for STVG
☆21 · Dec 5, 2025 · Updated 7 months ago
nusnlp / d2vlm
[ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models
☆24 · Apr 18, 2026 · Updated 3 months ago
Tree-Shu-Zhao / RebQ.pytorch
This is the official code for the paper "Reconstruct before Query: Continual Missing Modality Learning with Decomposed Prompt Collaborati…
☆12 · Aug 13, 2024 · Updated last year
DFKI-NLP / REval
[ACL 20] Probing Linguistic Features of Sentence-level Representations in Neural Relation Extraction
☆13 · Apr 21, 2020 · Updated 6 years ago
sunoh-kim / pps
Pytorch implementation of the paper 'Gaussian Mixture Proposals with Pull-Push Learning Scheme to Capture Diverse Events for Weakly Super…
☆19 · Jan 19, 2024 · Updated 2 years ago
lezhang7 / MOQAGPT
[EMNLP'2023 Findings] MoqaGPT, for zero-shot multimodal question answering with LLMs
☆13 · Dec 28, 2024 · Updated last year
qwerwsc1 / CT-MVSNet
[MMM'24 Oral] CT-MVSNet: Efficient Multi-View Stereo with Cross-scale Transformer
☆18 · Apr 18, 2024 · Updated 2 years ago
Tangkfan / Awesome-Temporal-Video-Grounding
paper list on Video Moment Retrieval (VMR), or Temporal Video Grounding (TVG), Video Grounding (VG), or Temporal Sentence Grounding in Vi…
☆43 · Dec 27, 2025 · Updated 6 months ago
zchoi / GLSCL
[TIP25] Code for "Text-Video Retrieval with Global-Local Semantic Consistent Learning"
☆16 · May 12, 2025 · Updated last year
lihongzhao99 / MMDG_Benchmark
The official implementation of the paper "Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study"
☆15 · May 8, 2026 · Updated 2 months ago
yoxu515 / VIPOSeg-Benchmark
The benchmark for "Video Object Segmentation in Panoptic Wild Scenes".
☆12 · Oct 17, 2023 · Updated 2 years ago
murufeng / knowledge_distillation
A plug-and-play knowledge distillation toolkit
☆13 · May 16, 2022 · Updated 4 years ago
FingerRec / OA-Transformer
[CVPR 2022] The code for our paper "Object-aware Video-language Pre-training for Retrieval"
☆61 · May 25, 2022 · Updated 4 years ago
PyOpenTS / PyOpenTS
Efficient and User-Friendly Time Series Analysis Library for PyOpenTS with pytorch compatibility.
☆16 · Aug 14, 2023 · Updated 2 years ago