facebookresearch / video-distant-supervision

This is the official PyTorch implementation of Learning To Recognize Procedural Activities with Distant Supervision. The repository provides PyTorch code for training and testing as described in the paper. The proposed distant supervision framework achieves strong generalization performance on step classification, recognition of procedural…

☆42

Alternatives and similar repositories for video-distant-supervision

Users who are interested in video-distant-supervision are comparing it to the libraries listed below.

klauscc / VindLU
☆108Updated 2 years ago
LisaAnne / TemporalLanguageRelease
☆43Updated 4 years ago
antoyang / just-ask
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
☆123Updated last year
j-min / HiREST
Hierarchical Video-Moment Retrieval and Step-Captioning (CVPR 2023)
☆102Updated 5 months ago
TengdaHan / TemporalAlignNet
[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
☆118Updated last year
showlab / Region_Learner
The PyTorch implementation for "Video-Text Pre-training with Learned Regions"
☆42Updated 3 years ago
medhini / Instructional-Video-Summarization
Code for the paper "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency" (ECCV 2022)
☆38Updated 2 years ago
jayleicn / singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
☆135Updated 2 years ago
microsoft / LAVENDER
A Unified Framework for Video-Language Understanding
☆57Updated 2 years ago
tsujuifu / pytorch_empirical-mvm
A PyTorch implementation of EmpiricalMVM
☆41Updated last year
salesforce / paprika
Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"
☆49Updated 5 months ago
StanLei52 / GEBD
[ICCV2021] Generic Event Boundary Detection: A Benchmark for Event Segmentation
☆69Updated 3 years ago
Soldelli / MAD
MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
☆165Updated last year
Chuhanxx / Temporal_Query_Networks
The implementation of the CVPR 2021 paper Temporal Query Networks for Fine-grained Video Understanding
☆62Updated 3 years ago
rxtan2 / video-grounding-narrations
☆12Updated 2 years ago
MikeWangWZHL / VidIL
PyTorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
☆115Updated 2 years ago
antoine77340 / RareAct
RareAct: A video dataset of unusual interactions
☆32Updated 4 years ago
TXH-mercury / COSA
[ICLR2024] Codes and Models for COSA: Concatenated Sample Pretrained Vision-Language Foundation Model
☆43Updated 6 months ago
airsplay / vimpac
☆73Updated 3 years ago
TheShadow29 / VidSitu
[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
☆60Updated 3 years ago
zjr2000 / LLMVA-GEBC
Winning solution to the Generic Event Boundary Captioning task in the LOVEU Challenge (CVPR 2023 workshop)
☆29Updated last year
FingerRec / OA-Transformer
[CVPR 2022] The code for our paper "Object-aware Video-language Pre-training for Retrieval"
☆62Updated 3 years ago
tzhhhh123 / HC-STVG
The HC-STVG Dataset
☆56Updated 2 years ago
brown-palm / AntGPT
Official code implementation of the paper AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?
☆22Updated 9 months ago
google-research-datasets / Video-Timeline-Tags-ViTT
A collection of videos annotated with timelines where each video is divided into segments, and each segment is labelled with a short free…
☆26Updated 3 years ago
ShiYaya / emscore
Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"
☆26Updated 2 years ago
medhini / clip_it
CLIP-It! Language-Guided Video Summarization
☆74Updated 4 years ago
VALUE-Leaderboard / DataRelease
Data Release for VALUE Benchmark
☆31Updated 3 years ago
antoyang / TubeDETR
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
☆182Updated last year
RERV / UniAdapter
[ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by …
☆74Updated last year