VikParuchuri / texify
Math OCR model that outputs LaTeX and markdown
☆1,029 Updated last month
Alternatives and similar repositories for texify:
Users who are interested in texify are comparing it to the libraries listed below.
- Formula recognition based on LaTeX-OCR and ONNXRuntime. ☆333 Updated 4 months ago
- Extract structured text from pdfs quickly ☆427 Updated this week
- TexTeller can convert image to latex formulas (image2latex, latex OCR) with higher accuracy and exhibits superior generalization ability,… ☆458 Updated this week
- Markdown rendering + Latex extras (equations, tables, ...), with conversion features, for the scientific community ☆570 Updated last week
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition ☆274 Updated 2 months ago
- TF-ID: Table/Figure IDentifier for academic papers ☆229 Updated 7 months ago
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models ☆140 Updated 5 months ago
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception ☆887 Updated last month
- An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them… ☆2,219 Updated 2 months ago
- Detect and extract tables to markdown and csv ☆728 Updated last month
- Lightweight, performant, deep table extraction ☆422 Updated this week
- A Comprehensive Benchmark for Document Parsing and Evaluation ☆261 Updated last week
- Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ ☆601 Updated 3 months ago
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ☆2,518 Updated this week
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF ☆791 Updated 2 weeks ago
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic… ☆265 Updated last month
- Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy. ☆814 Updated 4 months ago
- The code powering searchthearxiv.com, a simple semantic search engine for more than 300,000 ML papers on arXiv. ☆141 Updated last month
- Convert all of libgen to high quality markdown ☆248 Updated last year
- A High-efficiency Open-source Toolkit for Table-to-Latex Task ☆211 Updated 2 months ago
- Vision infrastructure to turn complex documents into RAG/LLM-ready data ☆1,695 Updated this week
- Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the o… ☆2,485 Updated 8 months ago
- Python bindings to PDFium ☆535 Updated this week
- This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs. ☆632 Updated 3 weeks ago
- Parse PDFs into markdown using Vision LLMs ☆288 Updated 3 weeks ago
- An extremely fast implementation of whisper optimized for Apple Silicon using MLX. ☆656 Updated 9 months ago
- Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜 ☆1,226 Updated last week
- Large scale training of Latex formula recognition model, currently being organized and open source ☆48 Updated 10 months ago
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ☆175 Updated 9 months ago
- UniTable: Towards a Unified Table Foundation Model ☆439 Updated 9 months ago