An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.
☆3,087Feb 7, 2026Updated 2 months ago
Alternatives and similar repositories for Pix2Text
Users that are interested in Pix2Text are comparing it to the libraries listed below.
- pix2tex: Using a ViT to convert images of equations into LaTeX code.☆16,295Jan 18, 2025Updated last year
- Enhanced math formula recognition: handwritten and printed formulas in Chinese and English, with support for elementary symbolic derivation (data structures based on the LaTeX abstract syntax tree). Math Formula OCR Pro, supports handwriting, Chinese-mixed formulas and simple symbol reaso…☆1,292Jun 11, 2024Updated last year
- TexTeller can convert images to LaTeX formulas (image2latex, LaTeX OCR) with higher accuracy and exhibits superior generalization ability,…☆729Aug 22, 2025Updated 7 months ago
- Formula recognition based on LaTeX-OCR and ONNXRuntime.☆383Nov 3, 2024Updated last year
- CnSTD: A Python3 package based on PyTorch/MXNet for Chinese/English Scene Text Detection, Mathematical Formula Detection (MFD), and Layout Analysis☆786Feb 7, 2026Updated 2 months ago
- MixTeX: multimodal LaTeX, Chinese-English (ZhEn), and table OCR. It performs efficient CPU-based inference locally and offline on Windows.☆1,615Apr 24, 2025Updated 11 months ago
- Implementation of Nougat Neural Optical Understanding for Academic Documents☆9,894Feb 21, 2025Updated last year
- Math OCR model that outputs LaTeX and markdown☆1,117Jan 29, 2025Updated last year
- UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition☆463Sep 28, 2025Updated 6 months ago
- A Comprehensive Toolkit for High-Quality PDF Content Extraction☆9,562Jan 3, 2025Updated last year
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,109Feb 10, 2025Updated last year
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.☆58,131Apr 3, 2026Updated last week
- Chinese Mathematical Formula Detection (MFD) Dataset for Chinese documents☆34Dec 21, 2022Updated 3 years ago
- Convert PDF to markdown + JSON quickly with high accuracy☆33,352Apr 4, 2026Updated last week
- [ECCV 2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.☆1,894Dec 30, 2024Updated last year
- A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team…☆1,823Mar 17, 2026Updated 3 weeks ago
- A High-efficiency Open-source Toolkit for Table-to-Latex Task☆275Dec 6, 2025Updated 4 months ago
- Math Formula OCR☆551Mar 24, 2023Updated 3 years ago
- Using GPT to parse PDF☆3,553Apr 17, 2025Updated 11 months ago
- DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception☆2,099Apr 14, 2025Updated 11 months ago
- OCR, layout analysis, reading order, table recognition in 90+ languages☆19,557Apr 3, 2026Updated last week
- Convert images of LaTeX math equations into LaTeX code.☆2,162Oct 4, 2022Updated 3 years ago
- Markdown rendering + LaTeX extras (equations, tables, ...), with conversion features, for the scientific community☆660Updated this week
- 1st-place solution for the ICDAR 2021 Competition on Mathematical Formula Detection (winning formula detection entry)☆133Sep 4, 2023Updated 2 years ago
- CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scen…☆3,742Feb 7, 2026Updated 2 months ago
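- 360LayoutAnalysis, a series of Document Analysis Models and Datasets developed by 360 AI Research Institute☆306Sep 10, 2024Updated last year
- [EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved, supporting Google/DeepL/Ollama/OpenAI and other services,…☆32,758Updated this week
- Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers; modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; support for local models such as chatglm3…☆70,369Jan 25, 2026Updated 2 months ago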
- Codebase for fine-tuning / evaluating nougat-based image2latex generation models☆160Sep 25, 2024Updated last year
- End-to-end image-to-LaTeX formula recognition implemented in PyTorch, inspired by LinXueyuanStdio/LaTeX_OCR_PRO☆179Apr 6, 2020Updated 6 years ago
- mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding☆2,373May 30, 2025Updated 10 months ago
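- Open-source, free, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in multi-language libraries.☆42,988Nov 20, 2025Updated 4 months ago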
- When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition (ECCV’2022 Poster).☆385Aug 5, 2024Updated last year
- Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/…☆75,347Updated this week
- Data repository for LaTeX OCR☆140Jun 11, 2024Updated last year
- Translate scientific papers in LaTeX, especially arXiv papers☆1,356Sep 26, 2025Updated 6 months ago
- 📄 Awesome OCR toolkits in multiple programming languages, based on ONNX Runtime, OpenVINO, MNN, PaddlePaddle, TensorRT and PyTorch.☆6,288Updated this week
- Use ChatGPT to summarize arXiv papers. Accelerates the whole research workflow, using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses☆19,369Mar 2, 2026Updated last month
- CDLA: A Chinese document layout analysis (CDLA) dataset☆290Sep 13, 2021Updated 4 years ago