VikParuchuri / texify
Math OCR model that outputs LaTeX and markdown
☆1,105 Updated last year
Alternatives and similar repositories for texify
Users who are interested in texify are comparing it to the libraries listed below.
- Formula recognition based on LaTeX-OCR and ONNXRuntime. ☆378 Updated last year
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models ☆159 Updated last year
- Markdown rendering + Latex extras (equations, tables, ...), with conversion features, for the scientific community ☆653 Updated this week
- Extract structured text from pdfs quickly ☆661 Updated 7 months ago
- TexTeller can convert image to latex formulas (image2latex, latex OCR) with higher accuracy and exhibits superior generalization ability,… ☆705 Updated 5 months ago
- An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them… ☆3,001 Updated this week
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition ☆452 Updated 4 months ago
- Lightweight, performant, deep table extraction ☆526 Updated last month
- UniTable: Towards a Unified Table Foundation Model ☆522 Updated last year
- TF-ID: Table/Figure IDentifier for academic papers ☆245 Updated last year
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ☆2,855 Updated 2 weeks ago
- Detect and extract tables to markdown and csv ☆754 Updated last year
- img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing ☆850 Updated 3 months ago
- LaTeXML: a TeX and LaTeX to XML/HTML/ePub/MathML translator. ☆1,204 Updated last week
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception ☆1,981 Updated 9 months ago
- A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team… ☆1,817 Updated 10 months ago
- library supporting NLP and CV research on scientific papers ☆788 Updated last year
- The code powering searchthearxiv.com, a simple semantic search engine for more than 300,000 ML papers on arXiv. ☆166 Updated 9 months ago
- Large scale training of Latex formula recognition model, currently being organized and open source ☆56 Updated last year
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ☆306 Updated 5 months ago
- 🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM. ☆611 Updated 8 months ago
- A High-efficiency Open-source Toolkit for Table-to-Latex Task ☆274 Updated 2 months ago
- HTML to Markdown converter and crawler. ☆610 Updated 2 years ago
- Python bindings to PDFium, reasonably cross-platform. ☆721 Updated this week
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o… ☆2,848 Updated last year
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis ☆411 Updated 3 years ago
- Parse LaTeX math expressions ☆143 Updated last year
- CLI for document conversion for scientific documents, powered by Mathpix OCR ☆114 Updated 2 years ago
- A PyTorch implementation of DTrOCR: Decoder-only Transformer for Optical Character Recognition ☆195 Updated 3 months ago
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆285 Updated 4 months ago