NormXU / nougat-latex-ocr
Codebase for fine-tuning / evaluating nougat-based image2latex generation models
☆140Updated 5 months ago
Alternatives and similar repositories for nougat-latex-ocr:
Users that are interested in nougat-latex-ocr are comparing it to the libraries listed below
- Formula recognition based on LaTeX-OCR and ONNXRuntime.☆333Updated 4 months ago
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆274Updated 2 months ago
- TexTeller can convert images to LaTeX formulas (image2latex, LaTeX OCR) with higher accuracy and exhibits superior generalization ability,…☆458Updated this week
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order.☆175Updated 9 months ago
- Chinese Mathematical Formula Detection (MFD) Dataset for Chinese-language documents☆33Updated 2 years ago
- Data repository for LaTeX OCR☆112Updated 8 months ago
- Large-scale training of a LaTeX formula recognition model; currently being organized and open-sourced☆48Updated 10 months ago
- Object Detection Model for Scanned Documents☆88Updated last year
- YOLO models trained on DocLayNet - power your Document Intelligence with Layout Analysis☆85Updated last month
- A High-efficiency Open-source Toolkit for Table-to-Latex Task☆211Updated 2 months ago
- A large scale camera-taken table detection and recognition dataset.☆118Updated last year
- official code for "Fox: Focus Anywhere for Fine-grained Multi-page Document Understanding"☆140Updated 9 months ago
- This repo is used to release the ArxivFormula dataset.☆24Updated 3 months ago
- Another LaTeX equation OCR tool based on ConvNeXt and Transformer☆48Updated last year
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents☆36Updated last year
- ICDAR 2024 Table OCR Model☆29Updated 2 months ago
- ☆116Updated last year
- A PyTorch implementation of DTrOCR: Decoder-only Transformer for Optical Character Recognition☆133Updated 3 weeks ago
- My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"☆72Updated last month
- ☆78Updated 2 months ago
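- Reading order, LayoutReader☆12Updated 9 months ago
- Research on accelerating real-world deployment of the GOT-OCR project, not limited to any language☆59Updated 4 months ago
- Training a small but elegant formula recognition model based on TrOCR and the UniMER-1M dataset☆19Updated 3 months ago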
- Transformer-based OCR framework for training OCR or image-to-LaTeX models☆9Updated 2 years ago
- Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.☆177Updated this week
- Scanning Single Shot Detector for Math in Document Images☆129Updated last year
- Official implementation for ICDAR 2021 best poster paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Tr…☆125Updated last year
- Dataset of PNG images from synthetically generated table layouts with annotations in JSONL files☆135Updated last year
- ☆53Updated 8 months ago
- Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition☆27Updated last year