onealwj / MVLT
PyTorch implementation of the BMVC 2022 paper "Masked Vision-Language Transformers for Scene Text Recognition"
☆29Updated 2 years ago
Related projects
Alternatives and complementary repositories for MVLT
- ☆13Updated last year
- ☆16Updated 2 years ago
- STIRER: A Unified Model for Low-Resolution Scene Text Image Recovery and Recognition -- ACMMM 2023☆12Updated 10 months ago
- ☆25Updated 11 months ago
- ☆25Updated 5 months ago
- ☆22Updated 2 years ago
- The dataset used in the CVPR 2022 paper (SimAN: Exploring Self-Supervised Representation Learning of Scene Text via Similarity-Aware Norm…☆33Updated 2 years ago
- HHH☆33Updated 2 years ago
- Searching a High Performance Feature Extractor for Text Recognition Network. TPAMI 2022☆13Updated last year
- [NeurIPS2021] BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting☆67Updated last year
- ☆36Updated 4 months ago
- ☆14Updated 11 months ago
- [ACM MM 2020] Exploring Font-independent Features for Scene Text Recognition☆44Updated 3 years ago
- Code for the paper "Pushing the Performance Limit of Scene Text Recognizer without Human Annotation" (CVPR 2022).☆28Updated 2 years ago
- Official PyTorch implementation of `[ACMMM 2023]Relational Contrastive Learning for Scene Text Recognition`☆17Updated last year
- [MM2023] Official implementation of the paper "One-stage Low-resolution Text Recognition with High-resolution Knowledge Transfer"☆15Updated last year
- ☆22Updated 9 months ago
- ☆39Updated 4 years ago
- ☆35Updated last year
- Unconstrained Text Detection with Box Supervision and Dynamic Self-Training☆33Updated last year
- Geometric Augmentation for Text Image☆9Updated 4 years ago
- TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers☆21Updated 2 years ago
- Vision-Language Pre-Training for Boosting Scene Text Detectors (CVPR2022)☆12Updated 2 years ago
- Official PyTorch implementation of "CBNet: A Plug-and-Play Network for Segmentation-Based Scene Text Detection"☆15Updated 7 months ago
- WikiTableSet: The largest publicly available image-based table recognition dataset in three languages, built from Wikipedia☆25Updated last year
- ☆22Updated last year
- ☆10Updated last year
- ☆41Updated last year
- The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer☆45Updated 5 months ago