[PR 2024] A large Cross-Modal Video Retrieval Dataset with Reading Comprehension
☆31 Dec 28, 2023 Updated 2 years ago
Alternatives and similar repositories for TextVR
Users that are interested in TextVR are comparing it to the libraries listed below.
- ☆13 Feb 26, 2025 Updated last year
- Code-Implementation-of-Super-Resolution-ZOO (image & video) ☆10 Jul 6, 2020 Updated 5 years ago
- [ICCV 2023] Generative Prompt Model for Weakly Supervised Object Localization ☆57 Nov 10, 2023 Updated 2 years ago
- Towards Video Text Visual Question Answering: Benchmark and Baseline ☆41 Feb 26, 2024 Updated 2 years ago
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI 2024) ☆32 Mar 29, 2024 Updated 2 years ago
- [ICME 2023] FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation ☆13 May 13, 2023 Updated 3 years ago
- Edit and Generate Anything in 3D world! ☆13 Apr 15, 2023 Updated 3 years ago
- Composed Video Retrieval ☆62 May 2, 2024 Updated 2 years ago
- [IJCV 2024] TransDETR: End-to-end Video Text Spotting with Transformer ☆114 Mar 28, 2024 Updated 2 years ago
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight) ☆26 Aug 1, 2025 Updated 10 months ago
- We are very happy that our work has been accepted by ACM Multimedia 2024!🥰 ☆12 Jan 8, 2025 Updated last year
- [ECCV 2024] ControlCap: Controllable Region-level Captioning ☆81 Oct 25, 2024 Updated last year
- Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs" ☆95 Mar 9, 2025 Updated last year
- ☆12 Mar 31, 2025 Updated last year
- FishCam: A low-cost open source autonomous camera for aquatic research