[ECCV 2024] The first zero-shot setting for spatio-temporal video grounding.
☆11 · Jul 16, 2024 · Updated last year
Alternatives and similar repositories for E3M
Users who are interested in E3M are comparing it to the libraries listed below.
- ☆11 · Dec 6, 2024 · Updated last year
- ☆20 · Apr 2, 2024 · Updated last year
- [ICLR 2025] Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding ☆40 · Mar 18, 2025 · Updated last year
- VLG-Net: Video-Language Graph Matching Networks for Video Grounding ☆31 · May 31, 2022 · Updated 3 years ago
- ☆12 · Jul 4, 2024 · Updated last year
- Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023) ☆55 · Sep 7, 2023 · Updated 2 years ago
- ☆10 · May 18, 2024 · Updated last year
- ☆12 · Aug 25, 2023 · Updated 2 years ago
- PyTorch implementation of the original evidential-deep-learning (https://github.com/aamini/evidential-deep-learning/) ☆13 · Sep 20, 2021 · Updated 4 years ago
- ☆17 · Dec 25, 2023 · Updated 2 years ago
- [CVPR 2024] Context-Guided Spatio-Temporal Video Grounding ☆66 · Jun 28, 2024 · Updated last year
- ☆14 · Dec 25, 2024 · Updated last year
- [ACL 2025] Official code for "Learning to Reason from Feedback at Test-Time". ☆13 · May 16, 2025 · Updated 10 months ago
- [AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding ☆126 · Dec 10, 2024 · Updated last year
- ☆13 · Aug 7, 2017 · Updated 8 years ago
- The first unofficial implementation of CLIP4Caption: CLIP for Video Caption (ACMMM 2021) ☆15 · Jan 2, 2023 · Updated 3 years ago
- [CVPR'24] Code for Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models ☆18 · Jul 22, 2024 · Updated last year
- Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos ☆28 · Jun 24, 2024 · Updated last year
- [ICCV 2025] AdsQA: Towards Advertisement Video Understanding. arXiv: https://arxiv.org/abs/2509.08621 ☆34 · Oct 30, 2025 · Updated 4 months ago
- A neural text style transfer model ☆12 · Jun 23, 2019 · Updated 6 years ago
- ☆11 · Jan 24, 2024 · Updated 2 years ago
- [2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding ☆31 · Aug 5, 2023 · Updated 2 years ago
- Collection of papers about video-audio understanding ☆24 · Dec 26, 2025 · Updated 2 months ago
- A curated list of awesome Unlearnable Example papers and resources. ☆13 · Dec 14, 2025 · Updated 3 months ago
- Segment Anything with Deictic Prompting ☆27 · May 13, 2025 · Updated 10 months ago
- Multi-modal data augmentation for machine learning ☆16 · Jun 4, 2019 · Updated 6 years ago
- Rename Mac screenshots based on their contents with local Ollama or ChatGPT ☆20 · Dec 3, 2024 · Updated last year
- An Interactive Tool for Annotating Discourse Structure and Text Improvement ☆16 · Sep 15, 2021 · Updated 4 years ago
- ☆22 · May 3, 2025 · Updated 10 months ago
- Repository for "Uncertainty Regularized Evidential Regression" published in AAAI 2024 ☆21 · Jun 30, 2024 · Updated last year
- [ACL 2025] "World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning." https://arxiv.org/abs/2503.1… ☆17 · Jul 22, 2025 · Updated 8 months ago
- PyTorch re-implementation of Hierarchical Normalization for Robust Monocular Depth Estimation ☆21 · Dec 8, 2022 · Updated 3 years ago
- ☆22 · Jan 26, 2024 · Updated 2 years ago
- (CVPR 2026) Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation ☆30 · Feb 28, 2026 · Updated 3 weeks ago
- Adding scripts and dataset related to the acceptance of the paper for WACV 2025. ☆13 · Mar 20, 2025 · Updated last year
- PyTorch implementation for DESC - BMVC20 (Oral) & IJCV22 ☆17 · Dec 23, 2022 · Updated 3 years ago
- ☆10 · Feb 21, 2023 · Updated 3 years ago
- Large Visual Language Model (LVLM), Large Language Model (LLM), Multimodal Large Language Model (MLLM), Alignment, Agent, AI System, Survey ☆21 · Jul 27, 2025 · Updated 7 months ago
- The source for pimbook.org ☆18 · Feb 24, 2023 · Updated 3 years ago