TengdaHan/AutoAD

[CVPR'23 Highlight] AutoAD: Movie Description in Context.

☆104

Alternatives and similar repositories for AutoAD

Users who are interested in AutoAD are comparing it to the repositories listed below.

Jyxarthur / AutoAD-Zero
[ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M…
☆30 · May 16, 2026 · Updated 2 months ago
Soldelli / MAD
MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
☆176 · Oct 22, 2023 · Updated 2 years ago
Lzq5 / Video-Text-Alignment
☆28 · Jul 18, 2025 · Updated last year
JaesungHuh / SimpleDiarization
Simple diarization model
☆53 · Jun 13, 2025 · Updated last year
yuezih / Movie101
Narrative movie understanding benchmark
☆76 · Jun 11, 2025 · Updated last year
m-bain / CondensedMovies-chall
Condensed Movies Challenge 2021
☆22 · Sep 21, 2022 · Updated 3 years ago
m-bain / CondensedMovies
Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]
☆204 · Sep 21, 2022 · Updated 3 years ago
TengdaHan / TemporalAlignNet
[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
☆122 · Oct 9, 2023 · Updated 2 years ago
solicucu / D3G
☆15 · Oct 30, 2023 · Updated 2 years ago
PardoAlejo / MovieCuts
Learning to cut end-to-end pretrained modules
☆38 · Apr 17, 2025 · Updated last year
waybarrios / guidance-based-video-grounding
[ICCV 2023] The official PyTorch implementation of the paper: "Localizing Moments in Long Video Via Multimodal Guidance"
☆23 · Sep 26, 2024 · Updated last year
guxu313 / TeViS
☆21 · Aug 26, 2025 · Updated 10 months ago
MediaBrain-SJTU / K-Diag
☆10 · Aug 20, 2023 · Updated 2 years ago
haoningwu3639 / SimpleSDM-Video
A simple and flexible PyTorch implementation of Video StableDiffusion (ZeroScope_v2) based on diffusers.
☆20 · Feb 15, 2024 · Updated 2 years ago
mayu-ot / hidden-challenges-MR
Code for "Uncovering Hidden Challenges in Query-Based Video Moment Retrieval"
☆20 · Sep 7, 2020 · Updated 5 years ago
JaesungHuh / VoxMovies
Evaluation script for VoxMovies dataset in PyTorch
☆23 · Jan 12, 2024 · Updated 2 years ago
hyc2026 / StoryTeller
☆84 · Mar 10, 2025 · Updated last year
dawitmureja / AVE
This is the official repository for our ECCV 2022 paper titled, "The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assis…
☆54 · Nov 28, 2022 · Updated 3 years ago
Becomebright / GroundVQA
Official PyTorch code of GroundVQA (CVPR'24)
☆63 · Sep 13, 2024 · Updated last year
HJYao00 / Side4Video
☆42 · Apr 7, 2024 · Updated 2 years ago
Richard-61 / FineAction
The official codebase of FineAction dataset. We will update the data and code of our FineAction.
☆24 · Apr 10, 2025 · Updated last year
qirui-chen / MultiHop-EgoQA
[AAAI 2025] Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
☆38 · May 27, 2025 · Updated last year
IMCCretrieval / MomentDiff
MomentDiff: Generative Video Moment Retrieval from Random to Real (NeurIPS 2023)
☆80 · Nov 2, 2023 · Updated 2 years ago
wade3han / normlens
An official codebase for "NormLens: Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Comm…
☆10 · May 9, 2024 · Updated 2 years ago
gyhdog99 / RACRO2
Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559)
☆19 · Jul 1, 2025 · Updated last year
TengdaHan / MemDPC
[ECCV'20 Spotlight] Memory-augmented Dense Predictive Coding for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
☆167 · Apr 29, 2021 · Updated 5 years ago
v-iashin / SparseSync
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022)
☆56 · Jan 29, 2024 · Updated 2 years ago
facebookresearch / video-distant-supervision
This is an official PyTorch implementation of Learning To Recognize Procedural Activities with Distant Supervision. In this repository, w…
☆43 · Feb 21, 2023 · Updated 3 years ago
Soldelli / Awesome-Temporal-Language-Grounding-in-Videos
A curated list of resources on grounding natural language in video and related areas. :-)
☆105 · Mar 31, 2022 · Updated 4 years ago
HuiGuanLab / RaTSG
This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos"
☆13 · Aug 22, 2025 · Updated 10 months ago
alexmartin1722 / wikivideo
WikiVideo: Article Generation from Multiple Videos
☆15 · Nov 14, 2025 · Updated 8 months ago
MAGIC-AI4Med / RP3D-Diag
Code implementation of RP3D-Diag
☆17 · Nov 25, 2024 · Updated last year
showlab / MovieSeq
[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences
☆46 · Mar 11, 2025 · Updated last year
m-bain / frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
☆377 · May 19, 2022 · Updated 4 years ago
vigilant-umbrella / wikiHowUnofficialAPI
API to extract data from wikiHow
☆18 · Jul 10, 2021 · Updated 5 years ago
Jyxarthur / shot-by-shot
[ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H…
☆24 · May 16, 2026 · Updated 2 months ago
hwjiang1510 / VQLoC
(NeurIPS 2023) Open-set visual object query search & localization in long-form videos
☆26 · Feb 1, 2024 · Updated 2 years ago
EliM0 / TEXTureControlNet
Official Implementation for "TEXTure: Text-Guided Texturing of 3D Shapes"
☆21 · Feb 26, 2024 · Updated 2 years ago
nirat1606 / OADis
Official code for "Disentangling Visual Embeddings for Attributes and Objects", published at CVPR 2022
☆34 · Aug 4, 2023 · Updated 2 years ago