A survey on video and language understanding.
☆50Apr 21, 2023Updated 2 years ago
Alternatives and similar repositories for Awesome-Video-Language-Understanding
Users that are interested in Awesome-Video-Language-Understanding are comparing it to the libraries listed below.
- [ICLR 2025 Spotlight] This is the official repository for our paper: ''Enhancing Pre-trained Representation Classifiability can Boost its…☆25Apr 26, 2025Updated 11 months ago
- Dataset for voice speaker recognition technology for life-rescue drones☆24Jan 2, 2023Updated 3 years ago
- Sound Source Localization for PCM-A10 Microphone☆24Jan 16, 2023Updated 3 years ago
- Code for NeurIPS 2021 paper "Curriculum Learning for Vision-and-Language Navigation"☆15Dec 13, 2022Updated 3 years ago
- [ICML 2024] Official repository of ICML 2024 - RoboMP2: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language…☆11Apr 4, 2026Updated last week
- ☆14Sep 22, 2020Updated 5 years ago
- ☆27Jan 31, 2023Updated 3 years ago
- Code and Dataset for Learning to Solve Complex Tasks by Talking to Agents☆24May 24, 2022Updated 3 years ago
- Firmware for the EcoSteno stenographer keyboard☆12Feb 17, 2023Updated 3 years ago
- An experiment with movie scenes and contrastive learning☆11Feb 1, 2025Updated last year
- Sound Source Localization for PCM-A10 Microphone☆33Jan 31, 2023Updated 3 years ago
- [EMNLP 2024] TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering☆17Oct 31, 2024Updated last year
- (Unofficial) Implementation of ICLR 2021 paper "Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multil…☆14Sep 14, 2022Updated 3 years ago
- Training code of waypoint predictor in Discrete-to-Continuous VLN.☆29Mar 25, 2024Updated 2 years ago
- [EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding☆49Jan 9, 2024Updated 2 years ago
- [ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"☆136May 5, 2023Updated 2 years ago
- Data Release for VALUE Benchmark☆30Feb 16, 2022Updated 4 years ago
- Density Constrained Reinforcement Learning☆12Mar 24, 2023Updated 3 years ago
- [NeurIPS 2024] This is the official repository for our paper: ''Expanding Sparse Tuning for Low Memory Usage''.☆23Nov 8, 2025Updated 5 months ago
- SRD: A Tree Structure Based Decoder for Online Handwritten Mathematical Expression Recognition☆21Jul 20, 2020Updated 5 years ago
- Official implementation of the ECCV 2022 Oral paper: Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments☆34Dec 16, 2023Updated 2 years ago
- Awesome world models for manipulation☆55Sep 19, 2024Updated last year
- [CVPRW'23 Best Paper Award] Zero-shot Unsupervised Transfer Instance Segmentation☆24Aug 22, 2023Updated 2 years ago
- PyTorch Code and Data for EnvEdit: Environment Editing for Vision-and-Language Navigation (CVPR 2022)☆30Aug 2, 2022Updated 3 years ago
- Official Implementation of IVLN-CE: Iterative Vision-and-Language Navigation in Continuous Environments☆36Dec 16, 2023Updated 2 years ago
- ☆13Aug 19, 2024Updated last year
- VoiceBank-2023 is the speech corpus specially designed for constructing personalized Mandarin text-to-speech (TTS) systems.☆41Jan 4, 2026Updated 3 months ago
- ☆24Dec 11, 2024Updated last year
- [ACL 2023] Code and data for our paper "Measuring Progress in Fine-grained Vision-and-Language Understanding"☆13Jun 11, 2023Updated 2 years ago
- Voice speaker recognition technology for life-rescue drones☆23Jan 10, 2023Updated 3 years ago
- ☆18Apr 4, 2025Updated last year
- ☆16Jan 6, 2025Updated last year
- [ICLR 2023] CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding☆46Jun 9, 2025Updated 10 months ago
- Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs☆33Dec 9, 2025Updated 4 months ago
- Implementation prototype of the Deep Deterministic Off-Policy Gradient (DD-OPG) method.☆11Jun 12, 2019Updated 6 years ago
- [EMNLP 2025] Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations☆44Jan 14, 2026Updated 3 months ago
- ☆23Sep 29, 2021Updated 4 years ago
- To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …☆11Jun 18, 2024Updated last year
- Project page for the 'CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection', ECC…☆12May 29, 2021Updated 4 years ago