yingsen1/UniMD

UniMD: Towards Unifying Moment retrieval and temporal action Detection

☆57

Alternatives and similar repositories for UniMD

Users who are interested in UniMD are comparing it to the repositories listed below.

yangle15 / DyFADet-pytorch
☆32 · Jul 4, 2024 · Updated 2 years ago
Dotori-HJ / TE-TAD
[CVPR 2024] Official implementation of the paper "TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate E…
☆30 · Jun 26, 2024 · Updated 2 years ago
josephzpng / DisTime
DisTime: Distribution-based Time Representation for Video Large Language Models.
☆21 · Jul 10, 2025 · Updated last year
dingfengshi / TriDet
[CVPR2023] Code for the paper, TriDet: Temporal Action Detection with Relative Boundary Modeling
☆219 · Dec 27, 2023 · Updated 2 years ago
lntzm / MESM
The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)
☆32 · Mar 29, 2024 · Updated 2 years ago
huangb23 / VTimeLLM
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
☆295 · Jun 13, 2024 · Updated 2 years ago
sming256 / OpenTAD
OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
☆340 · Jul 14, 2026 · Updated last week
sming256 / AdaTAD
[CVPR2024] The official implementation of AdaTAD: End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
☆42 · Jul 9, 2024 · Updated 2 years ago
benedettaliberatori / T3AL
Official implementation of "Test-Time Zero-Shot Temporal Action Localization", CVPR 2024
☆75 · Sep 11, 2024 · Updated last year
sudo-Boris / mr-Blip
Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs"
☆95 · Mar 9, 2025 · Updated last year
fmu2 / snag_release
Official Implementation of SnAG (CVPR 2024)
☆59 · Apr 26, 2025 · Updated last year
HYUNJS / STOV-TAL
[WACV-2025] Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization
☆17 · May 28, 2025 · Updated last year
yeliudev / R2-Tuning
🌀 R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024)
☆91 · Jul 2, 2024 · Updated 2 years ago
fletcherjiang / LLMEPET
[MM'24 Oral] Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval
☆130 · Aug 23, 2024 · Updated last year
PolyU-ChenLab / ETBench
👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)
☆74 · Jan 20, 2025 · Updated last year
Hon-Wong / ByteVideoLLM
[ICCV 2025] Dynamic-VLM
☆28 · Dec 16, 2024 · Updated last year
minghangz / cnm
Weakly Supervised Video Moment Localisation with Contrastive Negative Sample Mining
☆31 · Apr 4, 2022 · Updated 4 years ago
TimeMarker-LLM / TimeMarker
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
☆107 · Nov 28, 2024 · Updated last year
DCDmllm / Momentor
☆81 · Nov 24, 2024 · Updated last year
yellow-binary-tree / HawkEye
Official implementation of HawkEye: Training Video-Text LLMs for Grounding Text in Videos
☆47 · Apr 29, 2024 · Updated 2 years ago
HengLan / CGSTVG
[CVPR 2024] Context-Guided Spatio-Temporal Video Grounding
☆66 · Jun 28, 2024 · Updated 2 years ago
zhenyingfang / Awesome-Temporal-Action-Detection-Temporal-Action-Proposal-Generation
Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation
☆591 · Updated this week
appletea233 / LLaVA-ST
[CVPR 2025] LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
☆84 · Jul 4, 2025 · Updated last year
IMCCretrieval / MomentDiff
MomentDiff: Generative Video Moment Retrieval from Random to Real--NeurIPS 2023
☆80 · Nov 2, 2023 · Updated 2 years ago
thearkaprava / MS-Temba
[CVPR 2026] Official Repository of 'MS-Temba: Multi-Scale Temporal Mamba for Understanding Long Untrimmed Videos'
☆48 · Jun 22, 2026 · Updated 3 weeks ago
happyharrycn / actionformer_release
Code release for ActionFormer (ECCV 2022)
☆570 · Apr 11, 2024 · Updated 2 years ago
lgzlIlIlI / Boosting-WTAL
☆48 · Sep 22, 2023 · Updated 2 years ago
zhou745 / GauFuse_WSTAL
☆21 · May 8, 2023 · Updated 3 years ago
sejong-rcv / PVLR
[ACM MM-24] Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization
☆13 · Oct 8, 2024 · Updated last year
OpenGVLab / video-mamba-suite
The suite of modeling video with Mamba
☆295 · May 14, 2024 · Updated 2 years ago
chenwei746 / EEVG
☆23 · Aug 20, 2024 · Updated last year
Pilhyeon / BAM-DETR
Official Pytorch Implementation of 'BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos'
☆36 · Feb 26, 2025 · Updated last year
asrafulashiq / hamnet
PyTorch implementation of AAAI 2021 paper: A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization
☆42 · Apr 20, 2021 · Updated 5 years ago
WHB139426 / Grounded-Video-LLM
[EMNLP 2025 Findings] Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
☆149 · Aug 21, 2025 · Updated 11 months ago
gyxxyg / VTG-LLM
[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
☆130 · Dec 10, 2024 · Updated last year
Zhuo-Cao / FlashVTG
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding. (WACV2025)
☆39 · Apr 17, 2025 · Updated last year
yexf308 / MachineLearning
Machine Learning Course From Scratch
☆13 · Jul 24, 2024 · Updated last year
whwu95 / Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
☆256 · Nov 29, 2024 · Updated last year
tub-rip / event_penguins
The official implementation of "Low-power, Continuous Remote Behavioral Localization with Event Cameras" (CVPR 2024)
☆13 · Sep 25, 2024 · Updated last year