Comprehensive benchmark for video text understanding
☆28Jun 4, 2025Updated 9 months ago
Alternatives and similar repositories for VidText
Users who are interested in VidText are comparing it to the repositories listed below
- Infrared and visible light fusion☆10Apr 17, 2019Updated 6 years ago
- Find strongest response of convolutional layers on an image dataset. Automatically compute receptive field for any CNN layer.☆14Feb 19, 2021Updated 5 years ago
- Implementing the paper☆15Nov 5, 2016Updated 9 years ago
- ☆11Jun 21, 2025Updated 8 months ago
- [ECAI 2024] First Creating Backgrounds Then Rendering Texts: A New Paradigm for Visual Text Blending☆14Jun 16, 2025Updated 8 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs☆54Mar 9, 2025Updated 11 months ago
- ☆28Oct 17, 2025Updated 4 months ago
- Chinese translation of Google's "Introduction to Agents"☆28Nov 14, 2025Updated 3 months ago
- ☆11Mar 25, 2020Updated 5 years ago
- ☆14Jun 1, 2023Updated 2 years ago
- ☆83May 20, 2025Updated 9 months ago
- Multi-focus image fusion using boosted random walks-based algorithm with two-scale focus maps☆12Jan 27, 2019Updated 7 years ago
- Adds YOLOv3_tiny and data augmentation (clip, brighten, change saturation)☆14Jan 14, 2021Updated 5 years ago
- 🔥🔥[NeurIPS2025]Exploring and mitigating semantic hallucinations in scene text perception and reasoning☆26Dec 11, 2025Updated 2 months ago
- Video action classification benchmark for common CNN architectures, implemented in PyTorch☆11Jan 31, 2022Updated 4 years ago
- Web Photo Source Identification based on Neural Enhanced Camera Fingerprint (WWW2023)☆14Feb 25, 2023Updated 3 years ago
- QQ Browser multimodal video similarity contest☆17Mar 28, 2024Updated last year
- ☆26Jan 28, 2026Updated last month
- This repository provides core code for managing large volumes of video footage, enabling content understanding, automatic tagging, and ve…☆20Mar 25, 2025Updated 11 months ago
- 2019 CCF iQIYI video copy (copyright) detection algorithm☆15Dec 11, 2019Updated 6 years ago
- Video Similarity from Deep Neural Nets☆15Jul 3, 2020Updated 5 years ago
- (CVPR25) Exploring Contextual Attribute Density in Referring Expression Counting☆18Dec 3, 2025Updated 3 months ago
- An adaptive multispectral image fusion using particle swarm optimization☆15Dec 15, 2021Updated 4 years ago
- The top conferences on video retrieval libraries in recent years, synchronized with my blog.☆14Nov 27, 2021Updated 4 years ago
- This project explores the different techniques (both scalable and non scalable) for Graph based semi supervised learning. Recent techniqu…☆14May 28, 2016Updated 9 years ago
- ☆34Jun 9, 2025Updated 8 months ago
- [ECCV2024] ModTr: Modality Translation for Object Detection Adaptation Without Forgetting Prior Knowledge☆19Nov 28, 2024Updated last year
- Show raw10 and raw16 images or convert them to JPEG/PNG.☆17Apr 2, 2021Updated 4 years ago
- Multiple-Person Multi-Camera Tracker☆13Feb 17, 2017Updated 9 years ago
- [IEEE TMM'25] Scene-Text Grounding for Text-Based Video Question Answering☆16Feb 16, 2026Updated 2 weeks ago
- VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos☆22Jan 26, 2026Updated last month
- (CVPR 2025 Highlight) Official repository of paper "AODRaw: Towards RAW Object Detection in Diverse Conditions" (https://arxiv.org/pdf/24…☆24Apr 6, 2025Updated 11 months ago
- Belief propagation is a general graph algorithm for ranking nodes based on their location in the graph and some prior knowledge. The ne…☆18Dec 9, 2014Updated 11 years ago
- Research work aimed at addressing the problem of modeling infinite-length context☆46Dec 18, 2025Updated 2 months ago
- WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning☆36Jun 10, 2025Updated 8 months ago
- This is a repo for the paper "Networking Systems for Video Anomaly Detection: A Tutorial and Survey". Paper: https://arxiv.org/abs/2405.1…☆32Mar 26, 2025Updated 11 months ago
- ☆19Jan 18, 2019Updated 7 years ago
- [NeurIPS'24] GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching☆28May 29, 2025Updated 9 months ago
- This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"☆26Aug 24, 2023Updated 2 years ago