Code for Learning to Learn Language from Narrated Video
☆33Oct 3, 2023Updated 2 years ago
Alternatives and similar repositories for expert
Users that are interested in expert are comparing it to the libraries listed below.
- Video action classification benchmark for common CNN architectures, implemented in PyTorch☆11Jan 31, 2022Updated 4 years ago
- Code for the Globetrotter project☆23Mar 17, 2022Updated 4 years ago
- Project page for "Visual Grounding in Video for Unsupervised Word Translation" CVPR 2020☆43Apr 26, 2020Updated 5 years ago
- Code for the paper Learning the Predictability of the Future (CVPR 2021)☆173Jul 31, 2023Updated 2 years ago
- Data Release for VALUE Benchmark☆30Feb 16, 2022Updated 4 years ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202…☆34Mar 10, 2026Updated last month
- The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding☆64Mar 9, 2022Updated 4 years ago
- Collection of useful FFMPEG commands for processing audio and video files.☆44Jan 29, 2019Updated 7 years ago
- Inferring and Executing Programs for Visual Reasoning☆21Jan 4, 2019Updated 7 years ago
- A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning.☆41Jun 29, 2022Updated 3 years ago
- Shapley values for assessing the importance of each frame in a video☆17Mar 1, 2021Updated 5 years ago
- Support library for the MaskRCNN masks extracted on EPIC-KITCHENS-100☆14Dec 1, 2020Updated 5 years ago
- Website-based resource monitor for Slurm system☆38Apr 6, 2023Updated 3 years ago
- [ACL 2019] Visually Grounded Neural Syntax Acquisition☆90Feb 24, 2024Updated 2 years ago
- The 1st place solution of 2022 Ego4d Natural Language Queries.☆32Sep 5, 2022Updated 3 years ago
- RareAct: A video dataset of unusual interactions☆34Aug 4, 2020Updated 5 years ago
- [CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.☆119Oct 9, 2023Updated 2 years ago
- Source code for "Weakly-Supervised Video Object Grounding from Text by Loss Weighting and Object Interaction"☆48Jun 22, 2024Updated last year
- PyTorch GPU distributed training code for MIL-NCE HowTo100M☆219Jul 5, 2022Updated 3 years ago
- Visual Speech Recognition For Low-Resource Languages with Automatic Labels (ICASSP 2024)☆16Mar 17, 2025Updated last year
- Self-supervised learning through the eyes of a child☆146Jul 20, 2021Updated 4 years ago
- WildVSR☆21Dec 13, 2023Updated 2 years ago
- [ECCV'20 Spotlight] Memory-augmented Dense Predictive Coding for Video Representation Learning. Tengda Han, Weidi Xie, Andrew Zisserman.☆167Apr 29, 2021Updated 4 years ago
- PyTorch 3D video classification models pre-trained on 65 million Instagram videos☆265Dec 7, 2019Updated 6 years ago
- ☆20Apr 18, 2024Updated 2 years ago
- ☆64Jan 5, 2022Updated 4 years ago
- A repo for processing the raw hand object detections to produce releasable pickles + library for using these☆41Oct 26, 2024Updated last year
- yaspi - Yet Another Slurm Python Interface☆48Apr 25, 2022Updated 3 years ago
- 🍴 Annotations for the EPIC KITCHENS-55 Dataset.☆155Mar 17, 2021Updated 5 years ago
- ☆96Feb 14, 2022Updated 4 years ago
- ☆11Feb 9, 2026Updated 2 months ago
- Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation (ACM MM 2024)☆20Mar 17, 2025Updated last year
- Starter Code for VALUE benchmark☆80Aug 23, 2022Updated 3 years ago
- Code for Oops! Predicting Unintentional Action in Video☆80Apr 13, 2020Updated 6 years ago
- [ICLR2026] The code for "Interp3D: Correspondence-Aware Interpolation for Generative Textured 3D Morphing."☆26Jan 21, 2026Updated 2 months ago
- Code for "Counterfactual Variable Control for Robust and Interpretable Question Answering"☆14Oct 13, 2020Updated 5 years ago
- The second version of the interface for Abstract Scenes research project.☆23May 16, 2022Updated 3 years ago
- neon implementation of SegNet☆13Jan 3, 2023Updated 3 years ago
- ☆28Jul 1, 2020Updated 5 years ago