Comprehensive benchmark for video text understanding
☆28Jun 4, 2025Updated 11 months ago
Alternatives and similar repositories for VidText
Users who are interested in VidText are comparing it to the libraries listed below.
- Official Code for TPAMI 2024 paper "EvHandPose: Event-based 3D Hand Pose Estimation with Sparse Supervision"☆18Dec 4, 2024Updated last year
- Optical Character Recognition (OCR / HTR) using Transformers☆11Aug 20, 2022Updated 3 years ago
- ☆11Jun 21, 2025Updated 10 months ago
- ☆32Jan 28, 2026Updated 3 months ago
- EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams [CVPR'24]☆32Jul 23, 2025Updated 9 months ago
- This demo demonstrates the AI capabilities of the mcxn947. It displays the image captured by the camera on the LCD screen and performs fa…☆12Jul 21, 2025Updated 9 months ago
- ☆14Apr 25, 2025Updated last year
- Multiple-Person Multi-Camera Tracker☆13Feb 17, 2017Updated 9 years ago
- ☆13May 17, 2025Updated 11 months ago
- Adds YOLOv3_tiny and data augmentation (clip, brighten, change saturation)☆14Jan 14, 2021Updated 5 years ago
- [ACCV 2024 (Oral, Best Application Paper)] Official Implementation of NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tra…☆16Dec 30, 2025Updated 4 months ago
- [TPAMI2025] BackMix: Regularizing Open Set Recognition by Removing Underlying Fore-Background Priors☆16Apr 23, 2025Updated last year
- [AAAI 2025] MMGDreamer: Mixed-Modality Graph for Geometry-Controllable 3D Indoor Scene Generation☆35Jul 26, 2025Updated 9 months ago
- An official Project related to Paper "Perceiving Ambiguity and Semantics without Recognition: An Efficient and Effective Ambiguous Scene …☆21Dec 3, 2023Updated 2 years ago
- (CVPR 2025 Highlight) Official repository of paper "AODRaw: Towards RAW Object Detection in Diverse Conditions" (https://arxiv.org/pdf/24…☆24Apr 6, 2025Updated last year
- [CVPR 2025] Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents☆32Jun 3, 2025Updated 11 months ago
- [NeurIPS'24] GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching☆31May 29, 2025Updated 11 months ago
- ☆25Jul 20, 2025Updated 9 months ago
- Rui Qian, Xin Lai, Xirong Li: BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022: IF=8.518)☆13Feb 12, 2026Updated 2 months ago
- ☆52Oct 20, 2025Updated 6 months ago
- ☆31Oct 17, 2025Updated 6 months ago
- VideoMathQA is a benchmark designed to evaluate mathematical reasoning in real-world educational videos☆23Jan 26, 2026Updated 3 months ago
- ☆93May 20, 2025Updated 11 months ago
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023☆30Jul 12, 2023Updated 2 years ago
- Official implementation of CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images☆21Jun 24, 2024Updated last year
- ☆24Sep 12, 2024Updated last year
- An unnecessarily tiny and minimal implementation of GPT-2 in NumPy.☆11Feb 12, 2023Updated 3 years ago
- LeNet-5 implemented in C++☆15Jul 16, 2018Updated 7 years ago
- [NeurIPS 2024 Oral] RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation☆19Dec 22, 2024Updated last year
- ☆33Sep 27, 2024Updated last year
- This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"☆26Aug 24, 2023Updated 2 years ago
- Multi-camera object tracking based on the TLD and GOTURN algorithms☆26Mar 22, 2020Updated 6 years ago
- ☆21Feb 29, 2024Updated 2 years ago
- 🔥🔥First-ever hour-scale video understanding models☆621Jul 14, 2025Updated 9 months ago
- Find strongest response of convolutional layers on an image dataset. Automatically compute receptive field for any CNN layer.☆14Feb 19, 2021Updated 5 years ago
- Age and gender recognition implemented in MXNet, with a model trained on a very large dataset.☆32Dec 24, 2024Updated last year
- Implementing the paper☆15Nov 5, 2016Updated 9 years ago
- DPS-Net: Deep Polarimetric Stereo Depth Estimation☆22Mar 22, 2024Updated 2 years ago
- 'Minimum Delay Object Detection From Video', ICCV 2019☆30Oct 3, 2019Updated 6 years ago