Jyxarthur/AutoAD-Zero

Jyxarthur / AutoAD-Zero

[ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, Max Bain, Arsha Nagrani, Gül Varol, Weidi Xie, Andrew Zisserman

☆31

Alternatives and similar repositories for AutoAD-Zero

Users that are interested in AutoAD-Zero are comparing it to the libraries listed below.

Jyxarthur / shot-by-shot
[ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H…
☆24 · May 16, 2026 · Updated 2 months ago
TengdaHan / TemporalAlignNet
[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
☆122 · Oct 9, 2023 · Updated 2 years ago
Jyxarthur / OCLR_model
[NeurIPS 2022] Segmenting Moving Objects via an Object-Centric Representation. Junyu Xie, Weidi Xie, Andrew Zisserman.
☆32 · Dec 20, 2023 · Updated 2 years ago
TengdaHan / slurm_web
Website-based resource monitor for Slurm system
☆39 · Apr 6, 2023 · Updated 3 years ago
TengdaHan / integrated_thesis_template
LaTeX template for Oxford integrated thesis
☆20 · Apr 7, 2025 · Updated last year
JaesungHuh / SimpleDiarization
Simple diarization model
☆53 · Jun 13, 2025 · Updated last year
chrirupp / cv_course
☆17 · Jan 6, 2026 · Updated 6 months ago
oxai / visogender
☆13 · May 10, 2025 · Updated last year
Jyxarthur / appear-refine
[ECCV 2024] Official Implementation of "Appearance-Based Refinement for Object-Centric Motion Segmentation" Junyu Xie, Weidi Xie, Andrew …
☆13 · Oct 23, 2024 · Updated last year
facebookresearch / ego4d-goalstep
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities (NeurIPS 2023)
☆61 · Apr 15, 2024 · Updated 2 years ago
DeepLearn-lab / modules-cloud
Content for cloud computing workshop
☆15 · Apr 20, 2018 · Updated 8 years ago
Lzq5 / Video-Text-Alignment
☆28 · Jul 18, 2025 · Updated last year
vivoutlaw / tcbp
Temporal Compact Bilinear Pooling (TCBP)
☆11 · May 27, 2020 · Updated 6 years ago
deeplsd / Merkel-Podcast-Corpus
This dataset is presented in the paper Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video…
☆12 · Sep 21, 2022 · Updated 3 years ago
visinf / fast-axiomatic-attribution
Fast Axiomatic Attribution for Neural Networks (NeurIPS*2021)
☆15 · Feb 24, 2026 · Updated 5 months ago
visinf / funnybirds-framework
FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods (ICCV 2023)
☆17 · Apr 8, 2024 · Updated 2 years ago
TengdaHan / MemDPC
[ECCV'20 Spotlight] Memory-augmented Dense Predictive Coding for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
☆167 · Apr 29, 2021 · Updated 5 years ago
Tiago-Roxo / WASD
☆20 · Updated this week
Soldelli / MAD
MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
☆176 · Oct 22, 2023 · Updated 2 years ago
ChristophReich1996 / Yeast-in-Microstructures-Dataset
Official and maintained implementation of the dataset paper "An Instance Segmentation Dataset of Yeast Cells in Microstructures" [EMBC 20…
☆14 · Feb 21, 2024 · Updated 2 years ago
tsb0601 / MultiMon
☆25 · Jun 22, 2023 · Updated 3 years ago
JiwanChung / tapm
☆11 · Dec 8, 2022 · Updated 3 years ago
visinf / self-adaptive
Semantic Self-adaptation: Enhancing Generalization with a Single Sample (TMLR 2023)
☆18 · Jul 21, 2023 · Updated 3 years ago
notwaldorf / old-research-papers
Old Reinforcement Learning research from university
☆10 · Jan 4, 2017 · Updated 9 years ago
yuezih / Movie101
Narrative movie understanding benchmark
☆76 · Jun 11, 2025 · Updated last year
josauder / coralscapes_to_3d
Quick and dirty demo showing how to project semantic segmentation from Coralscapes into 3D point clouds!
☆15 · Apr 16, 2026 · Updated 3 months ago
YuanJianhao508 / LikePhys
[ICLR2026] LikePhys, a training-free method that evaluates intuitive physics in video diffusion models by distinguishing physically valid…
☆16 · Mar 5, 2026 · Updated 4 months ago
visinf / primaps
Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals (TMLR 2024)
☆19 · Nov 27, 2024 · Updated last year
hyc2026 / StoryTeller
☆84 · Mar 10, 2025 · Updated last year
saakur / EventSegmentation
Code for CVPR 2019 paper
☆12 · Apr 26, 2019 · Updated 7 years ago
OpenNLPLab / Vicinity-Vision-Transformer
[TPAMI 2023] This is an official implementation for "Vicinity Vision Transformer".
☆22 · Jun 15, 2023 · Updated 3 years ago
JaesungHuh / VoxMovies
Evaluation script for VoxMovies dataset in PyTorch
☆23 · Jan 12, 2024 · Updated 2 years ago
visinf / veto
Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)
☆22 · Mar 23, 2026 · Updated 4 months ago
MCG-NJU / JoMoLD
[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
☆27 · Jul 15, 2022 · Updated 4 years ago
bofang98 / UATVR
[ICCV'23] UATVR: Uncertainty-Adaptive Text-Video Retrieval
☆13 · Nov 5, 2023 · Updated 2 years ago
jamespark3922 / lsmdc-baseline
☆14 · Aug 16, 2019 · Updated 6 years ago
PardoAlejo / MovieCuts
Learning to cut end-to-end pretrained modules
☆38 · Apr 17, 2025 · Updated last year
Exgc / AVMuST-TED
☆24 · Mar 30, 2024 · Updated 2 years ago
Yifan-Gao / open_retrieval_conversational_machine_reading
Open-Retrieval Conversational Machine Reading: A new setting & OR-ShARC dataset
☆13 · Nov 19, 2022 · Updated 3 years ago