lanfeng4659 / PSTR

☆8

Alternatives and similar repositories for PSTR

Users who are interested in PSTR are comparing it to the repositories listed below.

mxin262 / Bridging-Text-Spotting
(CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting.
☆66 Updated last year
mxin262 / ESTextSpotter
(ICCV 2023) ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
☆77 Updated last year
SCUT-DLVCLab / OCR-Reasoning
[arXiv: 2505.17163] OCR-Reasoning Benchmark: Unveiling the True Capabilities of MLLMs in Complex Text-Rich Image Reasoning
☆60 Updated last week
caipeng328 / ForCenNet
☆31 Updated last week
retsuh-bqw / SRFormer-Text-Det
[AAAI 2024] SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression
☆66 Updated 5 months ago
bytedance / E2STR
The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
☆53 Updated last year
whai362 / pan_pp_stable
☆28 Updated 2 years ago
xdxie / WordArt
The official code of CornerTransformer (ECCV 2022, Oral) on top of MMOCR.
☆145 Updated 2 years ago
PriNing / ODM
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
☆41 Updated 4 months ago
SCUT-DLVCLab / MegaHan97K
[PR 2025] The official GitHub page of "MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Ca…
☆62 Updated 3 weeks ago
Mountchicken / Union14M
[ICCV 2023] Code base for Revisiting Scene Text Recognition: A Data Perspective
☆190 Updated last year
wenwenyu / TCM
Turning a CLIP Model into a Scene Text Detector (CVPR2023) | Turning a CLIP Model into a Scene Text Spotter (TPAMI)
☆193 Updated last year
weijiawu / TransDETR
[IJCV 2024] TransDETR: End-to-end Video Text Spotting with Transformer
☆104 Updated last year
ThunderVVV / RCLSTR
Official PyTorch implementation of `[ACMMM 2023]Relational Contrastive Learning for Scene Text Recognition`
☆17 Updated last year
shannanyinxiang / SPTS
Official implementation of SPTS: Single-Point Text Spotting (ACM MM 2022 Oral)
☆143 Updated 2 years ago
bytedance / SPTSv2
The official implementation of SPTS v2: Single-Point Text Spotting
☆136 Updated 2 years ago
fh2019ustc / DeepEraser
The official code for “DeepEraser: Deep Iterative Context Mining for Generic Text Eraser”, TMM, 2024.
☆42 Updated 11 months ago
SCUT-DLVCLab / SCUT-EnsExam
SCUT-EnsExam is a real-world handwritten text erasure dataset for examination paper scenarios, which consists of 545 examination paper im…
☆13 Updated last year
yufanchen96 / RoDLA
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
☆36 Updated 4 months ago
Hxyz-123 / GoMatching
[NeurIPS'24] GoMatching: A Simple Baseline for Video Text Spotting via Long and Short Term Matching
☆26 Updated 2 months ago
SCUT-DLVCLab / GPT-4V_OCR
Evaluation of the Optical Character Recognition (OCR) capabilities of GPT-4V(ision)
☆125 Updated last year
Token-family / TokenFD
[ICCV2025] A Token-level Text Image Foundation Model for Document Understanding
☆111 Updated last week
yyyyyxie / DNTextSpotter
[ACMMM 2024]: Official implementation of the paper "DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training"
☆30 Updated 9 months ago
zzyhlyoko / DCTC
☆42 Updated last year
lcy0604 / CTRNet
This repository is the implementation of "Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Contex…
☆87 Updated 2 years ago
ymy-k / DPText-DETR
[AAAI'23 Oral] DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer
☆191 Updated last year
large-ocr-model / large-ocr-model.github.io
☆181 Updated last year
yh-hust / PDF-Wukong
[arXiv] PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling
☆122 Updated 2 months ago
Wei-ucas / TPSNet
☆26 Updated last year
mlpc-ucsd / TESTR
(CVPR 2022) Text Spotting Transformers
☆187 Updated 2 years ago