thinh-re / s-multimae
The official implementation of S-MultiMAE - A Multi-Ground Truth approach for RGB-D Saliency Detection
☆12 Updated 5 months ago
Alternatives and similar repositories for s-multimae
Users who are interested in s-multimae are comparing it to the repositories listed below
- Vietnamese handwritten text recognition system ☆17 Updated 4 years ago
- Create TensorRT-runtime for vietocr ☆12 Updated 3 years ago
- Dictionary-guided Scene Text Recognition (CVPR-2021) ☆149 Updated 10 months ago
- Official PyTorch implementation for Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features … ☆69 Updated last year
- a dataset for camera-based table detection ☆16 Updated 3 years ago
- ☆78 Updated last year
- Official implementation for Dessurt: Document end-to-end self-supervised understanding and recognition transformer ☆60 Updated 2 years ago
- [AAAI 2024] SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression ☆66 Updated 3 months ago
- A curated list of papers about key information extraction. ☆96 Updated 5 months ago
- ☆88 Updated 3 months ago
- It's the code for the paper Pushing the Performance Limit of Scene Text Recognizer without Human Annotation, CVPR 2022. ☆28 Updated 2 years ago
- CRAFT (Baek et al., 2019) model training code ☆46 Updated 9 months ago
- swin-transformer custom for OCR ☆114 Updated last year
- Scene text recognition ☆106 Updated 2 years ago
- ☆43 Updated 2 years ago
- Official implementation for "GLASS: Global to Local Attention for Scene-Text Spotting" (ECCV'22) ☆101 Updated 11 months ago
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023 ☆25 Updated last year
- ☆40 Updated 10 months ago
- DocILE: Document Information Localization and Extraction Benchmark ☆129 Updated last year
- [arXiv: 2505.17163] OCR-Reasoning Benchmark: Unveiling the True Capabilities of MLLMs in Complex Text-Rich Image Reasoning ☆40 Updated last week
- [MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications. ☆33 Updated 2 months ago
- Multimodal Semi-Supervised Learning for Text Recognition (SemiMTR) ☆83 Updated last year
- [ICDAR 2023] (Oral) An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation ☆73 Updated 8 months ago
- ☆38 Updated last year
- The official implementation of SPTS v2: Single-Point Text Spotting ☆135 Updated last year
- This is the official repository for Vista dataset - A Vietnamese multimodal dataset contains more than 700,000 samples of conversations a… ☆26 Updated last year
- (CVPR 2024) Bridging the Gap Between End-to-End and Two-Step Text Spotting. ☆64 Updated 11 months ago
- The largest VQA dataset for Vietnamese. Related to the text content in the image. ☆16 Updated last month
- OCR Annotations from Amazon Textract for Industry Documents Library ☆102 Updated 2 years ago
- ☆18 Updated 2 years ago