gsoykan / comics_text_plus
Official repository of the paper: "A Comprehensive Gold Standard and Benchmark for Comics Text Detection and Recognition"
☆24 · Updated last year
Related projects
Alternatives and complementary repositories for comics_text_plus
- Cross-lingual learning in scene text recognition (ICASSP 2024) ☆15 · Updated last month
- COO: Comic onomatopoeia dataset (ECCV 2022) ☆70 · Updated last year
- Code for AAAI 2023 Paper: “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models” ☆17 · Updated last year
- ☆11 · Updated 6 months ago
- An implementation of Tiling and Corruption (TACo) Augmentations for OCR/HTR ☆15 · Updated 2 years ago
- ☆38 · Updated last year
- A dashboard for exploring timm learning rate schedulers ☆18 · Updated last year
- Official repository accompanying the ICDAR 2023 paper ☆10 · Updated last year
- ☆18 · Updated last year
- Example code for prefix-tuning GPT/GPT-NeoX models and for inference with trained prefixes ☆12 · Updated last year
- Datasets and Evaluation Scripts for CompHRDoc ☆25 · Updated 7 months ago
- CTE: Contextualized Table Extraction Dataset ☆17 · Updated last year
- This repository contains source code for SoftCTC. Original paper can be found here: https://arxiv.org/abs/2212.02135 ☆19 · Updated last year
- 🚀🤗 A collection of templates for Hugging Face Spaces ☆35 · Updated last year
- ☆22 · Updated 9 months ago
- Visualize multi-model embedding spaces. The first goal is to quickly get a lay of the land of any embedding space. Then be able to scroll… ☆26 · Updated 6 months ago
- The largest VQA dataset for Vietnamese. Related to the text content in the image. ☆16 · Updated 6 months ago
- OCR-VQGAN, a discrete image encoder (tokenizer and detokenizer) for figure images in Paper2Fig100k dataset. Implementation of OCR Percept… ☆73 · Updated last year
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ☆20 · Updated 5 months ago
- Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene… ☆30 · Updated 2 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Updated last year
- Codebase for the SIMAT dataset and evaluation ☆38 · Updated 2 years ago
- ☆44 · Updated 3 years ago
- High-Performance Transformers for Table Structure Recognition Need Early Convolutions ☆41 · Updated 7 months ago
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua… ☆45 · Updated last month
- Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012 ☆49 · Updated 2 years ago
- PyTorch implementation of STR models for transfer learning in Indic Languages ☆16 · Updated 3 years ago
- Source code for the paper "Post-OCR Document Correction with Large Ensembles of Character Sequence-to-Sequence Models" ☆35 · Updated 11 months ago
- **ARCHIVED** Filesystem interface to 🤗 Hub ☆56 · Updated last year