[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
☆119Oct 9, 2023Updated 2 years ago
Alternatives and similar repositories for TemporalAlignNet
Users that are interested in TemporalAlignNet are comparing it to the libraries listed below
Sorting:
- The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding☆64Mar 9, 2022Updated 3 years ago
- [ECCV 22] LocVTP: Video-Text Pre-training for Temporal Localization☆39Jul 29, 2022Updated 3 years ago
- Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)☆17Jan 12, 2023Updated 3 years ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆136May 5, 2023Updated 2 years ago
- PyTorch GPU distributed training code for MIL-NCE HowTo100M☆219Jul 5, 2022Updated 3 years ago
- [AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding☆91Nov 16, 2022Updated 3 years ago
- The Pytorch implementation for "Video-Text Pre-training with Learned Regions"☆43Jul 15, 2022Updated 3 years ago
- Align and Prompt: Video-and-Language Pre-training with Entity Prompts☆188May 1, 2025Updated 9 months ago
- Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-based Active Learning☆15Dec 12, 2023Updated 2 years ago
- ☆193Oct 22, 2022Updated 3 years ago
- S3D Text-Video model trained on HowTo100M using MIL-NCE☆200Jul 3, 2020Updated 5 years ago
- [ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M…☆28Jan 28, 2025Updated last year
- [ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos☆127Sep 29, 2023Updated 2 years ago
- This is an official pytorch implementation of Learning To Recognize Procedural Activities with Distant Supervision. In this repository, w…☆43Feb 21, 2023Updated 3 years ago
- [EMNLP'22] Weakly-Supervised Temporal Article Grounding☆14Nov 25, 2023Updated 2 years ago
- Official Pytorch implementation for AAAI2021 paper (RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning)☆37Nov 5, 2021Updated 4 years ago
- Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]☆380May 19, 2022Updated 3 years ago
- [CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos☆101Oct 30, 2022Updated 3 years ago
- ☆12Mar 12, 2023Updated 2 years ago
- MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions☆172Oct 22, 2023Updated 2 years ago
- Code release for "Learning Video Representations from Large Language Models"☆536Oct 1, 2023Updated 2 years ago
- Code for the HowTo100M paper☆293Mar 10, 2020Updated 5 years ago
- Video Feature Extraction Code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"☆117Jun 9, 2021Updated 4 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆116Sep 15, 2022Updated 3 years ago
- [CVPR2022] Unsupervised Pre-training for Temporal Action Localization Tasks (UP-TAL)☆29Mar 9, 2022Updated 3 years ago
- TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks (ICCVW 2021)☆117Sep 16, 2023Updated 2 years ago
- CVPR2022:Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency☆18Aug 10, 2022Updated 3 years ago
- [ECCV'20 Spotlight] Memory-augmented Dense Predictive Coding for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.☆167Apr 29, 2021Updated 4 years ago
- Code for paper, "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" ECCV 2022☆39Feb 17, 2023Updated 3 years ago
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset☆90Sep 6, 2023Updated 2 years ago
- ☆36Apr 14, 2021Updated 4 years ago
- [NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.☆289Oct 10, 2021Updated 4 years ago
- An official implementation for " UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"☆364Jul 25, 2024Updated last year
- [CVPR'23 Highlight] AutoAD: Movie Description in Context.☆103Nov 6, 2024Updated last year
- ☆27Jul 18, 2025Updated 7 months ago
- [ACL 2021] mTVR: Multilingual Video Moment Retrieval☆27Aug 20, 2022Updated 3 years ago
- ☆73Jun 3, 2022Updated 3 years ago
- Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)☆32May 15, 2023Updated 2 years ago
- MERLOT: Multimodal Neural Script Knowledge Models☆225Mar 15, 2022Updated 3 years ago