microsoft/VideoX

VideoX: a collection of video cross-modal models

☆1,071

Alternatives and similar repositories for VideoX

Users that are interested in VideoX are comparing it to the libraries listed below.

chenjoya / 2dtan
An optimized re-implementation for 2D-TAN: Learning 2D Temporal Localization Networks for Moment Localization with Natural Language (AAAI…
☆128 · Apr 1, 2023 · Updated 3 years ago
JonghwanMun / LGI4temporalgrounding
Repository for the CVPR-20 paper "Local-Global Video-Text Interactions for Temporal Grounding"
☆132 · Jul 5, 2021 · Updated 5 years ago
ikuinen / CMIN_moment_retrieval
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
☆87 · Nov 22, 2020 · Updated 5 years ago
jiyanggao / TALL
TALL: Temporal Activity Localization via Language Query
☆220 · Mar 15, 2018 · Updated 8 years ago
26hzhang / VSLNet
Span-based Localizing Network for Natural Language Video Localization (ACL 2020)
☆113 · Oct 15, 2021 · Updated 4 years ago
JaywongWang / CBP
Official TensorFlow implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware P…
☆59 · Mar 24, 2023 · Updated 3 years ago
yytzsy / SCDM
Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
☆71 · Sep 7, 2021 · Updated 4 years ago
niluthpol / weak_supervised_video_moment
Weakly Supervised Video Moment Retrieval from Text Queries
☆43 · Jul 20, 2020 · Updated 6 years ago
sangminwoo / Explore-And-Match
Official PyTorch implementation of "Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding …
☆42 · Aug 5, 2022 · Updated 3 years ago
MCG-NJU / MMN
[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
☆91 · Nov 16, 2022 · Updated 3 years ago
OpenGVLab / efficient-video-recognition
☆184 · Aug 20, 2022 · Updated 3 years ago
jayleicn / moment_detr
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
☆349 · Mar 9, 2026 · Updated 4 months ago
MIV-XJTU / ARTrack
PyTorch implementation of the papers "ARTrack" and "ARTrackV2"
☆316 · Oct 20, 2025 · Updated 9 months ago
sallymmx / ActionCLIP
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
☆613 · Dec 6, 2023 · Updated 2 years ago
ju-chen / Efficient-Prompt
☆197 · Oct 22, 2022 · Updated 3 years ago
yytzsy / ABLR_code
The source code of the paper: "To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression"
☆30 · Jan 8, 2019 · Updated 7 years ago
jayleicn / ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…
☆730 · Aug 8, 2023 · Updated 2 years ago
Alvin-Zeng / DRN
Dense Regression Network for Video Grounding (CVPR 2020)
☆53 · Jan 28, 2021 · Updated 5 years ago
TalalWasim / Vita-CLIP
Official repository for "Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting" [CVPR 2023]
☆126 · Jul 1, 2023 · Updated 3 years ago
yawenzeng / Awesome-Cross-Modal-Video-Moment-Retrieval
A continuously updated list of recent papers on video moment localization, temporal language grounding, and video clip retrieval.
☆265 · Aug 26, 2023 · Updated 2 years ago
ArrowLuo / CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
☆1,028 · Apr 12, 2024 · Updated 2 years ago
yytzsy / grounding_changing_distribution
☆36 · Apr 14, 2021 · Updated 5 years ago
Huntersxsx / RaNet
Source code of our RaNet in EMNLP 2021
☆30 · May 31, 2022 · Updated 4 years ago
jayleicn / TVRetrieval
[ECCV 2020] PyTorch code for XML on TVRetrieval dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
☆163 · May 28, 2024 · Updated 2 years ago
ikuinen / semantic_completion_network
☆26 · Aug 4, 2020 · Updated 5 years ago
r-cui / ViGA
"Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022
☆68 · Jun 27, 2022 · Updated 4 years ago
muzairkhattak / ViFi-CLIP
[CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners".
☆309 · Apr 3, 2024 · Updated 2 years ago
SCZwangxiao / Temporal-Language-Grounding-in-videos
Temporal Moment (Action) Localization via Language / Temporal Language Grounding / Video Moment Retrieval
☆100 · Jan 23, 2022 · Updated 4 years ago
WuJie1010 / Temporally-language-grounding
A PyTorch implementation of some state-of-the-art models for "Temporally Language Grounding in Untrimmed Videos"
☆95 · Sep 21, 2019 · Updated 6 years ago
antoyang / TubeDETR
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
☆194 · Sep 24, 2023 · Updated 2 years ago
dazhang-cv / MAN
This is the official repo for "MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment"
☆17 · May 27, 2019 · Updated 7 years ago
Huntersxsx / TSGV-Learning-List
Related work on Temporal Sentence Grounding in Videos / Natural Language Video Localization / Video Moment Retrieval
☆31 · Mar 4, 2022 · Updated 4 years ago
BonnieHuangxin / SLTA
ACM ICMR 2019 "Cross-Modal Video Moment Retrieval with Spatial and Language-Temporal Attention"
☆36 · Jun 19, 2019 · Updated 7 years ago
JJBOY / BMN-Boundary-Matching-Network
A PyTorch implementation of the paper: "BMN: Boundary-Matching Network for Temporal Action Proposal Generation", which is ac…
☆300 · Dec 5, 2021 · Updated 4 years ago
whwu95 / Text4Vis
[AAAI 2023 & IJCV] Transferring Vision-Language Models for Visual Recognition: A Classifier Perspective
☆199 · May 30, 2024 · Updated 2 years ago
Soldelli / VLG-Net
VLG-Net: Video-Language Graph Matching Networks for Video Grounding
☆31 · May 31, 2022 · Updated 4 years ago
farewellthree / STAN
Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring"
☆107 · Jan 28, 2024 · Updated 2 years ago
crodriguezo / TMLGA
Repository of proposal-free temporal moment localization work
☆33 · Jun 11, 2024 · Updated 2 years ago
iworldtong / Awesome-Temporal-Sentence-Grounding-in-Videos
A curated list of work on grounding natural language in video and related areas. :-)
☆82 · Dec 16, 2019 · Updated 6 years ago