Code for Learning to Learn Language from Narrated Video
☆33 · Oct 3, 2023 · Updated 2 years ago
Alternatives and similar repositories for expert
Users that are interested in expert are comparing it to the libraries listed below.
- Video action classification benchmark for common CNN architectures, implemented in PyTorch ☆11 · Jan 31, 2022 · Updated 4 years ago
- Code for the Globetrotter project ☆23 · Mar 17, 2022 · Updated 4 years ago
- Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020 ☆43 · Apr 26, 2020 · Updated 6 years ago
- Code for the paper Learning the Predictability of the Future (CVPR 2021) ☆173 · Jul 31, 2023 · Updated 2 years ago
- When can you tell whether an image has been cropped or not? ☆29 · Sep 19, 2021 · Updated 4 years ago
- Data Release for VALUE Benchmark ☆30 · Feb 16, 2022 · Updated 4 years ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202… ☆37 · Mar 10, 2026 · Updated 2 months ago
- The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding ☆64 · Mar 9, 2022 · Updated 4 years ago
- Collection of useful FFMPEG commands for processing audio and video files. ☆44 · Jan 29, 2019 · Updated 7 years ago
- CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training ☆34 · Nov 9, 2021 · Updated 4 years ago
- Inferring and Executing Programs for Visual Reasoning ☆21 · Jan 4, 2019 · Updated 7 years ago
- Shapley values for assessing the importance of each frame in a video ☆17 · Mar 1, 2021 · Updated 5 years ago
- Support library for the MaskRCNN masks extracted on EPIC-KITCHENS-100 ☆14 · Dec 1, 2020 · Updated 5 years ago
- Website-based resource monitor for Slurm system ☆38 · Apr 6, 2023 · Updated 3 years ago
- [ACL 2019] Visually Grounded Neural Syntax Acquisition ☆90 · Feb 24, 2024 · Updated 2 years ago
- The 1st place solution of 2022 Ego4d Natural Language Queries. ☆32 · Sep 5, 2022 · Updated 3 years ago
- RareAct: A video dataset of unusual interactions ☆34 · Aug 4, 2020 · Updated 5 years ago
- Code for the paper Real-Time Neural Voice Camouflage ☆28 · Apr 13, 2022 · Updated 4 years ago
- Localize objects in images using referring expressions ☆37 · Nov 1, 2016 · Updated 9 years ago
- Source code for "Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction" ☆48 · Jun 22, 2024 · Updated last year
- Visual Speech Recognition For Low-Resource Languages with Automatic Labels (ICASSP 2024) ☆16 · Mar 17, 2025 · Updated last year
- Self-supervised learning through the eyes of a child ☆145 · Jul 20, 2021 · Updated 4 years ago
- WildVSR ☆22 · Dec 13, 2023 · Updated 2 years ago
- [ECCV'20 Spotlight] Memory-augmented Dense Predictive Coding for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman. ☆167 · Apr 29, 2021 · Updated 5 years ago
- PyTorch 3D video classification models pre-trained on 65 million Instagram videos ☆265 · Dec 7, 2019 · Updated 6 years ago
- Video Representation Learning by Dense Predictive Coding. Tengda Han, Weidi Xie, Andrew Zisserman. ☆256 · Oct 8, 2021 · Updated 4 years ago
- Website for TextVQA dataset. ☆29 · Apr 30, 2023 · Updated 3 years ago
- ☆64 · Jan 5, 2022 · Updated 4 years ago
- [ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset ☆91 · Sep 6, 2023 · Updated 2 years ago
- yaspi - Yet Another Slurm Python Interface ☆48 · Apr 25, 2022 · Updated 4 years ago
- ☆96 · Feb 14, 2022 · Updated 4 years ago
- ☆11 · Feb 9, 2026 · Updated 3 months ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024) ☆20 · Mar 17, 2025 · Updated last year
- Starter Code for VALUE benchmark ☆79 · Aug 23, 2022 · Updated 3 years ago
- Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors." (Spotlight at the BMVC 2022) ☆56 · Jan 29, 2024 · Updated 2 years ago
- Code for Oops! Predicting Unintentional Action in Video ☆80 · Apr 13, 2020 · Updated 6 years ago
- [ICLR2026] The code for "Interp3D: Correspondence-Aware Interpolation for Generative Textured 3D Morphing." ☆28 · Jan 21, 2026 · Updated 4 months ago
- Code for "Counterfactual Variable Control for Robust and Interpretable Question Answering" ☆14 · Oct 13, 2020 · Updated 5 years ago
- The second version of the interface for Abstract Scenes research project. ☆23 · May 16, 2022 · Updated 4 years ago