THUNLP-MT/MUSEG

Repo for paper "MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding".

☆40

Alternatives and similar repositories for MUSEG

Users who are interested in MUSEG are comparing it to the repositories listed below.

xiaomi-research / time-r1
[NeurIPS'25] Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
☆93 · Dec 14, 2025 · Updated 5 months ago
nusnlp / d2vlm
[ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models
☆24 · Apr 18, 2026 · Updated last month
HuiGuanLab / RaTSG
This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"
☆13 · Aug 22, 2025 · Updated 8 months ago
minghangz / TFVTG
☆51 · Sep 13, 2024 · Updated last year
V-STaR-Bench / V-STaR
Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning
☆43 · Mar 2, 2026 · Updated 2 months ago
renjie-liang / HUAL
Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning
☆15 · Dec 12, 2023 · Updated 2 years ago
THUNLP-MT / CODIS
Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models".
☆12 · Oct 14, 2024 · Updated last year
minjoong507 / BM-DETR
[WACV 2025] Official PyTorch code for "Background-aware Moment Detection for Video Moment Retrieval"
☆16 · Feb 24, 2025 · Updated last year
houzhijian / CONE
[2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
☆31 · Aug 5, 2023 · Updated 2 years ago
lntzm / MESM
The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)
☆32 · Mar 29, 2024 · Updated 2 years ago
Tanveer81 / ReVisionLLM
This is the official implementation of ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos
☆46 · Nov 5, 2025 · Updated 6 months ago
josephzpng / DisTime
DisTime: Distribution-based Time Representation for Video Large Language Models.
☆20 · Jul 10, 2025 · Updated 10 months ago
zjucsq / PLA
[ICLR2023] Video Scene Graph Generation from Single-Frame Weak Supervision
☆12 · Sep 17, 2023 · Updated 2 years ago
TencentARC / TimeLens
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
☆132 · Apr 27, 2026 · Updated 3 weeks ago
yongliang-wu / NumPro
[CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga
☆149 · Jan 19, 2026 · Updated 4 months ago
yellow-binary-tree / HawkEye
Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos
☆47 · Apr 29, 2024 · Updated 2 years ago
afcedf / SOONet
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
☆29 · Jun 24, 2024 · Updated last year
Jayce1kk / SpaceVLLM
SpaceVLLM: Endowing Multimodal Large Language Model with Spatio-Temporal Video Grounding Capability
☆17 · May 8, 2025 · Updated last year
winni18 / MC-DML
☆13 · Mar 2, 2025 · Updated last year
Andy-xiaokang / AP1400-2
AP1400-2
☆10 · Aug 5, 2024 · Updated last year
Becomebright / GroundVQA
Official PyTorch code of GroundVQA (CVPR'24)
☆63 · Sep 13, 2024 · Updated last year
gyxxyg / TRACE
[ICLR 2025] TRACE: Temporal Grounding Video LLM via Causal Event Modeling
☆156 · Aug 22, 2025 · Updated 8 months ago
mlvlab / VidChain
Official Implementation (Pytorch) of the "VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Capti…
☆24 · Jan 26, 2025 · Updated last year
www-Ye / Time-R1
R1-like Video-LLM for Temporal Grounding
☆135 · Jun 20, 2025 · Updated 11 months ago
XMUDeepLIT / TTCS
The code implementation for TTCS: Test-Time Curriculum Synthesis for Self-Evolving.
☆47 · Apr 22, 2026 · Updated 3 weeks ago
THUNLP-MT / EscapeCraft
Official repo for EscapeCraft (a 3D environment for room escape) and the benchmark MM-Escape. This work was accepted to ICCV 2025.
☆39 · Jul 7, 2025 · Updated 10 months ago
aim-uofa / PerturboLLaVA
☆17 · Apr 20, 2025 · Updated last year
Evangelion09 / GuidedNet
code for GuidedNet
☆13 · Feb 16, 2023 · Updated 3 years ago
Tanveer81 / RGNet
This is the official implementation of RGNet: A Unified Retrieval and Grounding Network for Long Videos
☆19 · Mar 3, 2025 · Updated last year
VisualAIKHU / Keyword-DETR
Official Repository for "Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection" (AAAI …
☆15 · Mar 1, 2025 · Updated last year
facebookresearch / HierVL
[CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings
☆46 · Aug 14, 2023 · Updated 2 years ago
THUNLP-MT / SKR
Self-Knowledge Guided Retrieval Augmentation for Large Language Models (EMNLP Findings 2023)
☆28 · Dec 8, 2023 · Updated 2 years ago
Master-Tan / BUAA_OS
Beihang University (BUAA) Department 6 operating systems course, 2022 term: BUAA_OS
☆11 · Sep 13, 2022 · Updated 3 years ago
sjtu-medialab / Free-Viewpoint-RGB-D-Video-Dataset
A New Free Viewpoint RGB-D Video Dataset
☆13 · Jan 29, 2024 · Updated 2 years ago
yeliudev / R2-Tuning
🌀 R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024)
☆92 · Jul 2, 2024 · Updated last year
TimeMarker-LLM / TimeMarker
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
☆106 · Nov 28, 2024 · Updated last year
NagabhushanSN95 / DIBR-BD
Unofficial implementation of the paper: Cho et al., "Hole Filling Method for Depth Image Based Rendering Based on Boundary Decision", SPL…
☆12 · Sep 17, 2022 · Updated 3 years ago
solor-wind / BUAA_OO_TEST
Automated grader for the BUAA OO course
☆14 · Jun 7, 2024 · Updated last year
ml-research / deictic-segment-anything
Segment Anything with Deictic Prompting
☆27 · May 13, 2025 · Updated last year