This is an official PyTorch implementation of "Learning To Recognize Procedural Activities with Distant Supervision". In this repository, we provide PyTorch code for training and testing as described in the paper. The proposed distant supervision framework achieves strong generalization performance on step classification, recognition of procedural…
☆43Feb 21, 2023Updated 3 years ago
Alternatives and similar repositories for video-distant-supervision
Users that are interested in video-distant-supervision are comparing it to the libraries listed below.
- ☆15May 23, 2023Updated 2 years ago
- Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"☆50Jan 27, 2025Updated last year
- Official code repository for "Video-Mined Task Graphs for Keystep Recognition in Instructional Videos" arXiv, 2023☆14Apr 1, 2024Updated 2 years ago
- [CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.☆119Oct 9, 2023Updated 2 years ago
- Code implementation for our ECCV 2022 paper titled "My View is the Best View: Procedure Learning from Egocentric Videos"☆34Feb 5, 2024Updated 2 years ago
- ☆24Aug 19, 2024Updated last year
- [CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations"☆55Aug 8, 2023Updated 2 years ago
- Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos☆32Sep 9, 2024Updated last year
- A collection of videos annotated with timelines where each video is divided into segments, and each segment is labelled with a short free…☆29Jan 15, 2022Updated 4 years ago
- ☆19May 2, 2020Updated 5 years ago
- [ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing☆27Jul 15, 2022Updated 3 years ago
- PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)☆17Jan 12, 2023Updated 3 years ago
- HT-Step is a large-scale article grounding dataset of temporal step annotations on how-to videos☆25Mar 20, 2024Updated 2 years ago
- Java/python library and validator for the AIDA Interchange Format (AIF). Originally based on isi-vista/gaia-interchange.☆21Jun 14, 2023Updated 2 years ago
- Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)☆108Jan 23, 2025Updated last year
- [CVPR2022] SVIP: Sequence VerIfication for Procedures in Videos☆24Feb 24, 2023Updated 3 years ago
- Code for recreating the HoS benchmark of VISOR☆23Jul 2, 2023Updated 2 years ago
- Prompt Generation Networks for Input-Space Adaptation of Frozen Vision Transformers. Jochem Loedeman, Maarten C. Stol, Tengda Han, Yuki M…☆44Sep 11, 2024Updated last year
- Animals3D: Learning Articulated Shape with Keypoint Pseudo-labels from Web Images (CVPR 2023)☆14May 20, 2024Updated last year
- GPU-accelerated video decoder☆20May 18, 2021Updated 4 years ago
- Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024☆58Aug 19, 2025Updated 7 months ago
- Code implementation of the paper 'ExpertAF: Expert Actionable Feedback from Video'☆14Sep 30, 2025Updated 6 months ago
- [ECCV 22] LocVTP: Video-Text Pre-training for Temporal Localization☆39Jul 29, 2022Updated 3 years ago
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos"☆20Dec 14, 2025Updated 4 months ago
- MERLOT: Multimodal Neural Script Knowledge Models☆226Mar 15, 2022Updated 4 years ago
- FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods (ICCV 2023)☆21Feb 24, 2026Updated last month
- ☆107Apr 11, 2022Updated 4 years ago
- (CVPR 2023) Official implementation of the paper "Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos…☆31Apr 2, 2024Updated 2 years ago
- Code for CVPR2023 paper "Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies"☆18Mar 21, 2023Updated 3 years ago
- Code for "A Simple Baseline for Audio-Visual Scene-Aware Dialog"☆27May 26, 2020Updated 5 years ago
- fork from https://github.com/jwyang/faster-rcnn.pytorch☆10Aug 6, 2018Updated 7 years ago
- [CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos☆102Oct 30, 2022Updated 3 years ago
- Code for the VOST dataset☆26Oct 1, 2023Updated 2 years ago
- The benchmark for "Video Object Segmentation in Panoptic Wild Scenes".☆12Oct 17, 2023Updated 2 years ago
- Repository for the paper: Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models☆28Nov 29, 2023Updated 2 years ago
- The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.☆12Oct 15, 2021Updated 4 years ago
- Official repository for the MMFM challenge☆25Jun 18, 2024Updated last year
- This is the repo for Multi-level textual grounding☆34Jul 21, 2020Updated 5 years ago
- [CVPR 2024] KEPP: Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos☆12Sep 24, 2024Updated last year