antoine77340/MIL-NCE_HowTo100M

PyTorch GPU distributed training code for MIL-NCE HowTo100M

☆221

Alternatives and similar repositories for MIL-NCE_HowTo100M

Users interested in MIL-NCE_HowTo100M are comparing it to the repositories listed below.

antoine77340 / S3D_HowTo100M
S3D Text-Video model trained on HowTo100M using MIL-NCE
☆200 · Jul 3, 2020 · Updated 6 years ago
antoine77340 / howto100m
Code for the HowTo100M paper
☆303 · Mar 10, 2020 · Updated 6 years ago
simon-ging / coot-videotext
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
☆291 · Sep 6, 2022 · Updated 3 years ago
LuoweiZhou / YouCook2-Leaderboard
A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning.
☆41 · Jun 29, 2022 · Updated 4 years ago
TengdaHan / CoCLR
[NeurIPS'20] Self-supervised Co-Training for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.
☆288 · Oct 10, 2021 · Updated 4 years ago
microsoft / UniVL
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
☆366 · Jul 25, 2024 · Updated last year
m-bain / frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
☆376 · May 19, 2022 · Updated 4 years ago
antoine77340 / video_feature_extractor
Easy to use video deep features extractor
☆322 · Jul 5, 2020 · Updated 6 years ago
HumamAlwassel / XDC
Self-Supervised Learning by Cross-Modal Audio-Video Clustering (NeurIPS 2020)
☆91 · Oct 24, 2022 · Updated 3 years ago
BestJuly / IIC
Official implementation of ACMMM'20 paper 'Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework'
☆111 · Mar 22, 2021 · Updated 5 years ago
TengdaHan / TemporalAlignNet
[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
☆122 · Oct 9, 2023 · Updated 2 years ago
jayleicn / ClipBERT
[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning…
☆730 · Aug 8, 2023 · Updated 2 years ago
linjieli222 / HERO
Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
☆235 · Sep 16, 2021 · Updated 4 years ago
cshizhe / hgr_v2t
Code accompanying the paper "Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning".
☆211 · Jun 12, 2020 · Updated 6 years ago
jayleicn / singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
☆136 · May 5, 2023 · Updated 3 years ago
MichiganCOG / Video-Grounding-from-Text
Source code for "Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction"
☆47 · Jun 22, 2024 · Updated 2 years ago
DmZhukov / CrossTask
☆97 · Feb 14, 2022 · Updated 4 years ago
tsujuifu / pytorch_violet
A PyTorch implementation of VIOLET
☆138 · Dec 17, 2023 · Updated 2 years ago
medhini / Instructional-Video-Summarization
Code for the ECCV 2022 paper "TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency"
☆39 · Feb 17, 2023 · Updated 3 years ago
jayleicn / TVCaption
[ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset
☆91 · Sep 6, 2023 · Updated 2 years ago
laura-wang / video-pace
Code for our ECCV 2020 paper: Self-supervised Video Representation Learning by Pace Prediction
☆100 · May 13, 2021 · Updated 5 years ago
VALUE-Leaderboard / DataRelease
Data Release for VALUE Benchmark
☆30 · Feb 16, 2022 · Updated 4 years ago
jayleicn / recurrent-transformer
[ACL 2020] PyTorch code for MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
☆170 · Dec 4, 2020 · Updated 5 years ago
albanie / collaborative-experts
Video embeddings for retrieval with natural language queries
☆344 · Feb 15, 2023 · Updated 3 years ago
airsplay / vimpac
☆73 · Jun 3, 2022 · Updated 4 years ago
showlab / Region_Learner
The PyTorch implementation for "Video-Text Pre-training with Learned Regions"
☆43 · Jul 15, 2022 · Updated 4 years ago
gabeur / mmt
Multi-Modal Transformer for Video Retrieval
☆265 · Oct 9, 2024 · Updated last year
facebookresearch / AVID-CMA
Audio Visual Instance Discrimination with Cross-Modal Agreement
☆133 · Aug 13, 2021 · Updated 4 years ago
jalayrac / weakactionloc
Code release of our NeurIPS 18 paper "A flexible model for training action localization with varying levels of supervision"
☆16 · Dec 28, 2018 · Updated 7 years ago
rowanz / merlot
MERLOT: Multimodal Neural Script Knowledge Models
☆226 · Mar 15, 2022 · Updated 4 years ago
tridivb / slowfast_feature_extractor
Feature Extractor module for videos using the PySlowFast framework
☆80 · Apr 22, 2021 · Updated 5 years ago
jimmy646 / violin
Data and code for CVPR 2020 paper: "VIOLIN: A Large-Scale Dataset for Video-and-Language Inference"
☆161 · Apr 29, 2020 · Updated 6 years ago
JonghwanMun / LGI4temporalgrounding
Repository for the CVPR-20 paper "Local-Global Video-Text Interactions for Temporal Grounding"
☆132 · Jul 5, 2021 · Updated 5 years ago
ruotianluo / refexp-comprehension
Referring expression comprehension on ReferIt(RefClef)
☆10 · Nov 28, 2016 · Updated 9 years ago
sjenni / temporal-ssl
Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.
☆49 · Mar 18, 2021 · Updated 5 years ago
salesforce / ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
☆188 · May 1, 2025 · Updated last year
yuewang-cuhk / awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
☆1,159 · Aug 19, 2022 · Updated 3 years ago
PeihaoChen / RSPNet
Official PyTorch implementation for AAAI2021 paper (RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning)
☆37 · Nov 5, 2021 · Updated 4 years ago
moabitcoin / ig65m-pytorch
PyTorch 3D video classification models pre-trained on 65 million Instagram videos
☆265 · Dec 7, 2019 · Updated 6 years ago