VikParuchuri / texify
Math OCR model that outputs LaTeX and markdown
☆1,006 · Updated 2 months ago
Alternatives and similar repositories for texify:
Users who are interested in texify are comparing it to the libraries listed below:
- Formula recognition based on LaTeX-OCR and ONNXRuntime. ☆323 · Updated 2 months ago
- Extract structured text from pdfs quickly ☆393 · Updated this week
- TexTeller can convert image to latex formulas (image2latex, latex OCR) with higher accuracy and exhibits superior generalization ability,… ☆433 · Updated 5 months ago
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition ☆263 · Updated last month
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models ☆135 · Updated 4 months ago
- An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them… ☆2,141 · Updated last month
- Markdown rendering + Latex extras (equations, tables, ...), with conversion features, for the scientific community ☆556 · Updated last month
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception ☆798 · Updated 2 weeks ago
- Detect and extract tables to markdown and csv ☆723 · Updated this week
- Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ ☆573 · Updated last month
- Lightweight, performant, deep table extraction ☆393 · Updated last month
- UniTable: Towards a Unified Table Foundation Model ☆413 · Updated 7 months ago
- A Comprehensive Benchmark for Document Parsing and Evaluation ☆211 · Updated last week
- A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The servic… ☆236 · Updated this week
- TF-ID: Table/Figure IDentifier for academic papers ☆228 · Updated 6 months ago
- Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections. ☆2,387 · Updated 5 months ago
- Vision model based document ingestion ☆1,312 · Updated this week
- Easily deployable 🚀 API to convert PDF to markdown quickly with high accuracy. ☆796 · Updated 3 months ago
- A High-efficiency Open-source Toolkit for Table-to-Latex Task ☆195 · Updated last month
- Fast Semantic Text Deduplication ☆472 · Updated this week
- Generate a comprehensive review from an arXiv paper, then turn it into a blog post. This project powers the website below for the Hugging… ☆723 · Updated last week
- AI Powered Image search tool offers content-based, text, and visual similarity system-wide search. ☆228 · Updated last month
- Large scale training of Latex formula recognition model, currently being organized and open source ☆46 · Updated 9 months ago
- A Faster LayoutReader Model based on LayoutLMv3, Sort OCR bboxes to reading order. ☆160 · Updated 8 months ago
- DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis ☆308 · Updated last year
- Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from R… ☆319 · Updated this week
- Things you can do with the token embeddings of an LLM ☆1,419 · Updated 3 weeks ago
- This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs. ☆599 · Updated last month
- Another LaTex equation OCR tool based on ConvNeXt and Transformer ☆47 · Updated last year
- RAG (Retrieval-Augmented Generation) Chatbot Examples Using PyMuPDF ☆728 · Updated 2 months ago