zhousheng97 / ViTXT-GQA
[IEEE TMM'25] Scene-Text Grounding for Text-Based Video Question Answering
☆16 · Updated this week
Alternatives and similar repositories for ViTXT-GQA
Users who are interested in ViTXT-GQA are comparing it to the repositories listed below
- [ICLR 2024] Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement. ☆15 · Mar 12, 2024 · Updated last year
- [NeurIPS'24] GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching ☆28 · May 29, 2025 · Updated 8 months ago
- Project website of TE141K. ☆17 · Mar 24, 2020 · Updated 5 years ago
- [NeurIPS2021] BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting ☆68 · Oct 9, 2023 · Updated 2 years ago
- HeadlessPivot ☆29 · Jan 29, 2026 · Updated 2 weeks ago
- ☆16 · Updated this week
- Official Repository of RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning ☆14 · Jul 9, 2025 · Updated 7 months ago
- [CVPR 2025] Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding ☆15 · Jun 16, 2025 · Updated 8 months ago
- Implementation of various handwritten text line segmentation ☆10 · Jan 6, 2020 · Updated 6 years ago
- Data Programming for Text Detection in Documents using SPEAR ☆12 · Mar 26, 2025 · Updated 10 months ago
- Towards Video Text Visual Question Answering: Benchmark and Baseline ☆40 · Feb 26, 2024 · Updated last year
- Client-side navigation done right ☆11 · Dec 9, 2022 · Updated 3 years ago
- [WACV2025] source code of StrDA: https://arxiv.org/abs/2410.09913 ☆12 · Apr 15, 2025 · Updated 10 months ago
- Json patch serializer ☆13 · Dec 12, 2020 · Updated 5 years ago
- MathNet: A Data-Centric Approach, Dataset and Benchmark Model to Advance Mathematical Expression Recognition ☆10 · Mar 19, 2025 · Updated 10 months ago
- Used in M4C feature extraction script: https://github.com/facebookresearch/mmf/blob/project/m4c/projects/M4C/scripts/extract_ocr_frcn_fea… ☆13 · Jan 30, 2020 · Updated 6 years ago
- Implementation of the DocLLM paper for Llama models. ☆13 · Apr 6, 2025 · Updated 10 months ago
- STRExp is a framework that provides Explainability (XAI) to Scene Text Recognition (STR) models. ☆11 · Nov 27, 2023 · Updated 2 years ago
- Easy interactive prompts to create and validate data using JSON schema. ☆10 · Jan 24, 2026 · Updated 3 weeks ago
- [CVPR 2022] Accelerating Video Object Segmentation with Compressed Video ☆42 · Jul 3, 2022 · Updated 3 years ago
- SAN: Structure-Aware Network for Complex and Long-tailed Chinese Text Recognition ☆10 · Apr 8, 2024 · Updated last year
- ☆12 · Oct 5, 2024 · Updated last year
- Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval (ICCV 2025 Highlight) ☆20 · Aug 1, 2025 · Updated 6 months ago
- Quantization of LLMs and benchmarking. ☆10 · Apr 3, 2024 · Updated last year
- Official implementation for AAAI 2025 paper: SSAN: A Symbol Spatial-Aware Network for Handwritten Mathematical Expression Recognition ☆15 · Jan 21, 2025 · Updated last year
- [WACV2023] This is the official PyTorch implementation of our paper "[Rethinking Rotation in Self-Supervised Contrastive Learning: Adapt… ☆12 · Feb 24, 2023 · Updated 2 years ago
- [CVPR 2025] SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting ☆16 · Jul 1, 2025 · Updated 7 months ago
- Awesome-GenAITech: a curated list of Generative AI Techniques ☆11 · Jul 11, 2023 · Updated 2 years ago
- A curated list of resources on Document Layout Analysis ☆11 · Aug 7, 2025 · Updated 6 months ago
- NLP Workshops ☆11 · Apr 24, 2025 · Updated 9 months ago
- Official code for the paper: "Perception and Semantic Aware Regularization for Sequential Confidence Calibration (CVPR2023)" ☆10 · May 15, 2024 · Updated last year
- FNIN: A Fourier Neural Operator-based Numerical Integration Network for Surface-form-gradients ☆13 · Jan 22, 2025 · Updated last year
- Single-file, pure CUDA C implementation for running inference on Qwen3 0.6B GGUF. No Dependencies. ☆22 · Nov 26, 2025 · Updated 2 months ago
- ☆10 · Mar 31, 2025 · Updated 10 months ago
- [ICPR-2024] S-MultiMAE - A Multi-Ground Truth approach for RGB-D Saliency Detection ☆12 · Dec 13, 2024 · Updated last year
- Training PyTorch Faster-RCNN on custom dataset ☆14 · Jun 2, 2021 · Updated 4 years ago
- Tools for formatting large language model prompts. ☆13 · Dec 19, 2023 · Updated 2 years ago
- All coursework for the Learn Python Programming Masterclass by Tim Buchalka and Jean-Paul Roberts. ☆12 · May 5, 2022 · Updated 3 years ago
- ☆16 · Nov 5, 2024 · Updated last year