raphael-baena / DTLR
Handwritten Text Recognition and Character Detection
☆141Updated last week
Alternatives and similar repositories for DTLR:
Users who are interested in DTLR are comparing it to the libraries listed below
- [AAAI2025 Oral] Predicting the Original Appearance of Damaged Historical Documents☆69Updated 2 weeks ago
- [NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence☆142Updated 6 months ago
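- 📝 Extracts content from document images and reproduces it one-to-one in Word or Txt for further use or processing. Planned: support for PDF/image input with output in JSON, Txt, Word, and Markdown formats.☆188Updated 5 months ago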
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆141Updated 10 months ago
- Exploring deployment and acceleration of the GOT-OCR project, not limited to any language☆59Updated 5 months ago
- VimTS: A Unified Video and Image Text Spotter☆77Updated 4 months ago
- Chrome / Edge extension to turn arXiv papers into Markdown codes in one click.☆77Updated 2 weeks ago
- A gradio webui for Andrewyng translation-agent☆29Updated 3 months ago
- [AAAI 2025] StoryWeaver: A Unified World Model for Knowledge-Enhanced Story Character Customization☆205Updated last week
- ☆120Updated last year
- Vary-tiny codebase upon LAVIS (for training from scratch) and a PDF image-text pairs dataset (about 600k, including English/Chinese)☆79Updated 6 months ago
- Analysis of Chinese and English layouts☆185Updated last week
- To try textin document parsing, visit https://cc.co/16YSIy☆83Updated 4 months ago
- The Learnable Typewriter: A Generative Approach to Text Line Analysis☆31Updated 5 months ago
- ☆25Updated last month
- [ECCV2024] PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer☆69Updated 6 months ago
- ☆172Updated last month
- Official implementation for ICDAR 2024 Oral paper "ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expressi…☆26Updated 7 months ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆118Updated 4 months ago
- 💡 VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning☆37Updated this week
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆201Updated 10 months ago
- A Token-level Text Image Foundation Model for Document Understanding☆78Updated last week
- The official code for NeurIPS 2024 paper: Harmonizing Visual Text Comprehension and Generation☆116Updated 4 months ago
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆126Updated 4 months ago
- [IEEE TPAMI] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation☆255Updated last week
- The official code of CornerTransformer (ECCV 2022, Oral) on top of MMOCR.☆140Updated 2 years ago
- Document Artificial Intelligence☆157Updated 3 months ago
- [TAI 2023] Appearance Enhancement for Camera-captured Document Images in the Wild☆35Updated last year
- An inference library for sequence-based table recognition, integrating table recognition algorithms such as PP-Structure and modelscope.☆252Updated 2 months ago
- ☆26Updated 5 months ago