A Survey on video and language understanding.
☆50Apr 21, 2023Updated 2 years ago
Alternatives and similar repositories for Awesome-Video-Language-Understanding
Users that are interested in Awesome-Video-Language-Understanding are comparing it to the libraries listed below
- ☆14Sep 22, 2020Updated 5 years ago
- [ICML 2024] RoboMP2: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models☆12Jun 30, 2025Updated 8 months ago
- Dataset for speech and speaker recognition technology for search-and-rescue drones☆24Jan 2, 2023Updated 3 years ago
- CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction☆18Oct 20, 2025Updated 4 months ago
- Sound Source Localization for PCM-A10 Microphone☆24Jan 16, 2023Updated 3 years ago
- ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model☆16Jan 31, 2024Updated 2 years ago
- (Unofficial) Implementation of ICLR 2021 paper "Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multil…☆14Sep 14, 2022Updated 3 years ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents☆24May 24, 2022Updated 3 years ago
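- [WIP@Oct 13] 质衡 Benchmark (Q-Bench in Chinese), including Chinese versions of the low-level visual question answering and low-level visual description datasets, as well as image quality assessment under Chinese prompts. We will release Q-Bench in more languages in the futu…☆24Jan 7, 2024Updated 2 years ago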
- Generate audio datasets for training Text-To-Speech models, through smart audio splitting with silence detection, and transcription using…☆30May 27, 2023Updated 2 years ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆27Jan 17, 2026Updated last month
- ☆24Oct 8, 2023Updated 2 years ago
- SRD: A Tree Structure Based Decoder for Online Handwritten Mathematical Expression Recognition☆21Jul 20, 2020Updated 5 years ago
- ☆24Sep 29, 2021Updated 4 years ago
- A simple PyTorch and TensorBoard tutorial and example for beginners☆23Aug 9, 2020Updated 5 years ago
- Official implementation of the ECCV 2022 Oral paper: Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments☆35Dec 16, 2023Updated 2 years ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆136May 5, 2023Updated 2 years ago
- Official Implementation of IVLN-CE: Iterative Vision-and-Language Navigation in Continuous Environments☆35Dec 16, 2023Updated 2 years ago
- A curated list of resources about long-context in large-language models and video understanding.☆32Aug 8, 2023Updated 2 years ago
- Experimenting with GANs in Tensorflow/Keras☆10Jan 13, 2022Updated 4 years ago
- Code for reproducing "FMixCutMatch for Semi-supervised Deep Learning"☆12Nov 15, 2020Updated 5 years ago
- We explore the fascinating domain of text-to-image generation using the powerful capabilities of the Flux API. The objective is to trans…☆12Aug 14, 2024Updated last year
- PERT: A Progressively Region-based Network for Scene Text Removal (TIP2023)☆37Aug 11, 2023Updated 2 years ago
- An AI-powered tool that translates plain English commands into multi-step API workflows, automating the entire testing process.☆17Jul 27, 2025Updated 7 months ago
- [CVPR 2025] Official PyTorch implementation of MaskSub "Masking meets Supervision: A Strong Learning Alliance"☆45Mar 25, 2025Updated 11 months ago
- A responsive & browser compatible video player☆53Apr 18, 2017Updated 8 years ago
- Self hosted AI workflow for scraping Instagram Reels (audio and description). Extracting, summarising and categorising, then storing all …☆28Jan 10, 2026Updated last month
- An Identity-free Video Dataset for Micro-Gesture Understanding and Emotion Analysis (CVPR'21)☆47Feb 13, 2023Updated 3 years ago
- ☆12Nov 30, 2022Updated 3 years ago
- A repository to organize materials from the AI4LAM Teach and Learning Working Group☆14May 5, 2023Updated 2 years ago
- Colab notebooks for d2l-book☆11Dec 5, 2019Updated 6 years ago
- Code for our project CROWN (Conversational Passage Ranking by Reasoning over Word Networks)☆10Jan 11, 2024Updated 2 years ago
- ☆10Oct 28, 2019Updated 6 years ago
- ☆16Apr 28, 2023Updated 2 years ago
- AI Powered Writing Tool☆11Mar 7, 2024Updated last year
- Deep Semi-Supervised Learning with Holistic methods for audio classification.☆11Dec 14, 2024Updated last year
- Code for ACL22 short Paper "Hierarchical Curriculum Learning for AMR Parsing"☆13Jun 1, 2022Updated 3 years ago
- ☆11Apr 3, 2023Updated 2 years ago
- Open-Retrieval Conversational Machine Reading: A new setting & OR-ShARC dataset☆13Nov 19, 2022Updated 3 years ago