Bhashini-IITJ / visualTranslation
Implementation of Baseline for Scene Text-to-Scene Text Translation
☆18Updated 10 months ago
Alternatives and similar repositories for visualTranslation
Users that are interested in visualTranslation are comparing it to the libraries listed below
- [ICDAR 2023] (Oral) An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation☆76Updated last year
- This is the official implementation to the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Docume…☆30Updated 2 weeks ago
- ☆16Updated last year
- (Pattern Recognition) Pytorch implementation of “HTR-VT: Handwritten Text Recognition with Vision Transformer”☆118Updated last week
- Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023☆28Updated 2 years ago
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept…☆82Updated 3 years ago
- Official repository accompanying the ICDAR 2023 paper☆13Updated 2 years ago
- ☆87Updated 10 months ago
- This repository is the code of our paper "DiffUTE: Universal Text Editing Diffusion Model" (NeurIPS'2023).☆144Updated 9 months ago
- Cross-lingual learning in scene text recognition (ICASSP2024)☆18Updated last year
- ☆27Updated last year
- Official implementation of Generative Colorization of Structured Mobile Web Pages, WACV 2023.☆22Updated 2 years ago
- Official PyTorch Implementation of "WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models" - ICDAR 2023☆82Updated last year
- ☆17Updated last year
- ☆44Updated last year
- SOTA Document Image Enhancement - T2T-BinFormer: Effective Document Image Enhancement Using tokens-to-token Transformer Network☆24Updated 2 years ago
- ☆101Updated last year
- Official PyTorch Implementation of "DiffusionPen: Towards Controlling the Style of Handwritten Text Generation" - ECCV 2024☆83Updated last year
- Code for CVPR21 paper A Multiplexed Network for End-to-End, Multilingual OCR☆80Updated 3 years ago
- Basic HTR concepts/modules to boost performance☆37Updated last year
- Code for ACL 2024 findings paper "wav2vec-S: Adapting Pre-trained Speech Models for Streaming"☆10Updated 9 months ago
- ☆14Updated 2 years ago
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding☆23Updated last month
- ☆24Updated last year
- Official implementation of URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding (AAAI 2026…☆34Updated 2 months ago
- [ICCV 2023] ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules☆28Updated last year
- ☆10Updated 2 years ago
- A Versatile Face Encoder for Zero-Shot Diffusion Model Personalization☆24Updated 6 months ago
- Official implementation of UPOCR: Towards unified pixel-level OCR interface (ICML 2024)☆65Updated last year
- Official implementation for ICDAR 2024 Oral paper "ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expressi…☆28Updated last year