facebookresearch / video-distant-supervision
This is the official PyTorch implementation of Learning To Recognize Procedural Activities with Distant Supervision. The repository provides PyTorch code for training and testing as described in the paper. The proposed distant supervision framework achieves strong generalization performance on step classification, recognition of procedural…
☆39 Updated last year
Related projects:
- [CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990) ☆57 Updated 3 years ago
- A PyTorch implementation of EmpiricalMVM ☆39 Updated 9 months ago
- The 1st place solution of 2022 Ego4D Natural Language Queries. ☆32 Updated 2 years ago
- The PyTorch implementation for "Video-Text Pre-training with Learned Regions" ☆42 Updated 2 years ago
- ☆74 Updated 2 years ago
- ☆41 Updated 3 years ago
- RareAct: A video dataset of unusual interactions ☆32 Updated 4 years ago
- [CVPR 2022] The code for our paper 《Object-aware Video-language Pre-training for Retrieval》 ☆61 Updated 2 years ago
- ☆12 Updated last year
- Research code for "Training Vision-Language Transformers from Captions Alone" ☆34 Updated 2 years ago
- ☆100 Updated last year
- A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning. ☆37 Updated 2 years ago
- Data Release for VALUE Benchmark ☆32 Updated 2 years ago
- [ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos ☆117 Updated 11 months ago
- ☆22 Updated last year
- The implementation of the CVPR 2021 paper Temporal Query Networks for Fine-grained Video Understanding ☆62 Updated 2 years ago
- A Unified Framework for Video-Language Understanding ☆55 Updated last year
- Starter Code for VALUE benchmark ☆79 Updated 2 years ago
- Winning solution to the Generic Event Boundary Captioning task in the LOVEU Challenge (CVPR 2023 workshop) ☆29 Updated 8 months ago
- ☆25 Updated last year
- PyTorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners ☆112 Updated 2 years ago
- [Arxiv2022] Revitalize Region Feature for Democratizing Video-Language Pre-training ☆21 Updated 2 years ago
- Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding" ☆46 Updated last year
- [Findings of EMNLP 2022] AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant ☆23 Updated last year
- PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021) ☆17 Updated last year
- Code for Learning to Learn Language from Narrated Video ☆33 Updated 11 months ago
- Video-Text Representation Learning via Differentiable Weak Temporal Alignment (CVPR 2022) ☆14 Updated 5 months ago
- The HC-STVG Dataset ☆53 Updated last year
- Repo for paper: "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS 23 Spotlight) ☆32 Updated last year
- [ICCV2021] Generic Event Boundary Detection: A Benchmark for Event Segmentation ☆68 Updated 2 years ago