raphael-baena / DTLR
Handwritten Text Recognition and Character Detection
☆130Updated 4 months ago
Alternatives and similar repositories for DTLR:
Users who are interested in DTLR are comparing it to the libraries listed below
- [AAAI2025 Oral] Predicting the Original Appearance of Damaged Historical Documents☆63Updated 2 months ago
- [NAACL 2024] Visually Guided Generative Text-Layout Pre-training for Document Intelligence☆141Updated 6 months ago
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆139Updated 9 months ago
- [AAAI 2025] StoryWeaver: A Unified World Model for Knowledge-Enhanced Story Character Customization☆201Updated 3 weeks ago
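- 📝 Extracts content from document images and outputs it one-to-one to Word or Txt for further use or processing. Planned support for PDF/image input with output in JSON, Txt, Word, and Markdown formats.☆185Updated 4 months ago
- Research on accelerating real-world deployment of the GOT-OCR project, not limited to any language☆59Updated 4 months ago
- To try textin document parsing, click https://cc.co/16YSIy☆76Updated 4 months ago
- Corrects document distortion, blur, shadows, and similar defects using an ONNX model for simple, lightweight deployment; will keep following the latest and best document rectification approaches and models. Correct document distortion using a lightweight ONNX model for easy deployment. We wi…☆41Updated 2 months ago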
- ☆172Updated last month
- VimTS: A Unified Video and Image Text Spotter☆76Updated 4 months ago
- GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models☆67Updated 8 months ago
- Analysis of Chinese and English layouts☆177Updated 2 weeks ago
- The official code for NeurIPS 2024 paper: Harmonizing Visual Text Comprehension and Generation☆113Updated 3 months ago
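- 《高军 AI 日报》 (Gao Jun AI Daily): spend 1 minute a day to get curated, cutting-edge AI information. Covers, among other things, frontier AI news, AI tools, AI art, open-source projects, and learning tutorials.☆44Updated 3 months ago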
- Implementation of the table detection and table structure recognition deep learning model described in the paper "ClusterTabNet: Supervis…☆11Updated this week
- ☆80Updated 2 months ago
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆126Updated 3 months ago
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆279Updated this week
- ☆60Updated 3 months ago
- Document Artificial Intelligence☆154Updated 3 months ago
- The project page of Diffutoon☆26Updated last year
- 【ICDAR 2024】Coarse-to-Fine Document Image Registration for Dewarping☆17Updated 7 months ago
- A gradio webui for Andrewyng translation-agent☆28Updated 3 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆217Updated 2 weeks ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆117Updated 4 months ago
- To try textin document parsing, click https://cc.co/16YSIy☆22Updated 8 months ago
- 【ArXiv】PDF-Wukong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling☆113Updated 4 months ago
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆178Updated 9 months ago
- YOLO models trained on DocLayNet - power your Document Intelligence with Layout Analysis☆90Updated this week