This is the official PyTorch implementation of "Learning To Recognize Procedural Activities with Distant Supervision". The repository provides PyTorch code for training and testing as described in the paper. The proposed distant supervision framework achieves strong generalization performance on step classification, recognition of procedural…
☆43 · Feb 21, 2023 · Updated 3 years ago
Alternatives and similar repositories for video-distant-supervision
Users who are interested in video-distant-supervision are comparing it to the repositories listed below.
- Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding" ☆50 · Jan 27, 2025 · Updated last year
- ☆15 · May 23, 2023 · Updated 2 years ago
- Code implementation for our ECCV 2022 paper titled "My View is the Best View: Procedure Learning from Egocentric Videos" ☆34 · Feb 5, 2024 · Updated 2 years ago
- [CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman. ☆119 · Oct 9, 2023 · Updated 2 years ago
- [CVPR 2022] Sequential Voting with Relational Box Fields for Active Object Detection ☆10 · Jun 19, 2022 · Updated 3 years ago
- [ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing ☆27 · Jul 15, 2022 · Updated 3 years ago
- A collection of videos annotated with timelines where each video is divided into segments, and each segment is labelled with a short free… ☆29 · Jan 15, 2022 · Updated 4 years ago
- Java/Python library and validator for the AIDA Interchange Format (AIF). Originally based on isi-vista/gaia-interchange. ☆21 · Jun 14, 2023 · Updated 2 years ago
- MIMIC: Masked Image Modeling with Image Correspondences ☆17 · Jun 14, 2024 · Updated last year
- Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023) ☆107 · Jan 23, 2025 · Updated last year
- PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021) ☆17 · Jan 12, 2023 · Updated 3 years ago
- Prompt Generation Networks for Input-Space Adaptation of Frozen Vision Transformers. Jochem Loedeman, Maarten C. Stol, Tengda Han, Yuki M… ☆44 · Sep 11, 2024 · Updated last year
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 · Jul 21, 2023 · Updated 2 years ago
- ☆19 · May 2, 2020 · Updated 5 years ago
- Code for recreating the HoS benchmark of VISOR ☆22 · Jul 2, 2023 · Updated 2 years ago
- HT-Step is a large-scale article grounding dataset of temporal step annotations on how-to videos ☆24 · Mar 20, 2024 · Updated last year
- [CVPR'23 Highlight] AutoAD: Movie Description in Context. ☆103 · Nov 6, 2024 · Updated last year
- [CVPR2022] SVIP: Sequence VerIfication for Procedures in Videos ☆24 · Feb 24, 2023 · Updated 3 years ago
- GPU-accelerated video decoder ☆20 · May 18, 2021 · Updated 4 years ago
- Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos ☆29 · Sep 9, 2024 · Updated last year
- [ACCV 2024] Official Implementation of "AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description". Junyu Xie, Tengda Han, M… ☆28 · Jan 28, 2025 · Updated last year
- (CVPR 2023) Official implementation of the paper "Weakly Supervised Video Representation Learning with Unaligned Text for Sequential Videos… ☆31 · Apr 2, 2024 · Updated last year
- ☆107 · Apr 11, 2022 · Updated 3 years ago
- [CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos ☆102 · Oct 30, 2022 · Updated 3 years ago
- Official repository for the MMFM challenge ☆25 · Jun 18, 2024 · Updated last year
- Code for the VOST dataset ☆26 · Oct 1, 2023 · Updated 2 years ago
- Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models" ☆27 · Nov 29, 2023 · Updated 2 years ago
- ☆29 · Jun 15, 2022 · Updated 3 years ago
- The MECCANO Dataset: official repository in which we provide code and models. ☆32 · Jul 31, 2023 · Updated 2 years ago
- ☆32 · Jul 17, 2024 · Updated last year
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa… ☆76 · Feb 21, 2022 · Updated 4 years ago
- [CVPR 2022] Egocentric Action Target Prediction in 3D ☆32 · Dec 2, 2025 · Updated 3 months ago
- Code for the ECCV '20 paper "Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval" ☆202 · Apr 1, 2021 · Updated 4 years ago
- Official Implementation of "Interpretable 3D Neural Object Volumes for Robust Conceptual Reasoning." ICLR 2026. ☆30 · Feb 3, 2026 · Updated last month
- Website-based resource monitor for Slurm system ☆37 · Apr 6, 2023 · Updated 2 years ago
- ☆36 · Jul 9, 2025 · Updated 7 months ago
- [AAAI 2022 Oral] This is a PyTorch implementation of the AAAI 2022 paper "Cross-Domain Empirical Risk Minimization for Unbiased Long-tail… ☆33 · Feb 17, 2022 · Updated 4 years ago
- Code + pre-trained models for the paper "Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers" ☆233 · Jun 13, 2022 · Updated 3 years ago
- Code for the paper "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" (ECCV 2022) ☆39 · Feb 17, 2023 · Updated 3 years ago