RapidAI/RapidOCRPDF

RapidAI / RapidOCRPDF

Extracts PDF content using RapidOCR

☆191

Alternatives and similar repositories for RapidOCRPDF

Users that are interested in RapidOCRPDF are comparing it to the libraries listed below.

RapidAI / RapidOCR
📄 Awesome OCR toolkits for multiple programming languages, based on ONNX Runtime, OpenVINO, MNN, PaddlePaddle, TensorRT and PyTorch.
☆7,214 · Jul 9, 2026 · Updated last week
RapidAI / RapidOrientation
Document orientation classification
☆221 · Feb 3, 2026 · Updated 5 months ago
RapidAI / RapidLayout
Layout analysis for Chinese and English documents
☆275 · Mar 24, 2026 · Updated 3 months ago
RapidAI / RapidDocEx
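📝 Extracts content from document images and outputs it one-to-one to Word or Txt for further use or processing. Planned future support for PDF/image input with output in JSON, Txt, Word, and Markdown formats.
☆208 · Nov 1, 2024 · Updated last year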
billikeu / Go-EdgeGPT
Go-EdgeGPT: Reverse-engineered API of Microsoft's Bing Chat AI.
☆15 · Apr 10, 2023 · Updated 3 years ago
RapidAI / TableStructureRec
Collects the best currently open-source table recognition models, improves pre- and post-processing, and converts the models to ONNX.
☆954 · Aug 3, 2025 · Updated 11 months ago
jiangnanboy / pdf_to_docx
OCR, PDF to DOCX
☆23 · Nov 4, 2022 · Updated 3 years ago
RapidAI / RapidUnDistort
Corrects document distortion, blur, shadows, and similar issues, using ONNX models for simple, lightweight deployment. Will continue to track the latest and best document rectification methods and models.
☆105 · Dec 17, 2025 · Updated 7 months ago
RapidAI / RapidTable
Inference library for sequence-based table recognition algorithms, integrating table recognition algorithms from PP-Structure, modelscope, and others.
☆432 · Apr 23, 2026 · Updated 2 months ago
ck-unifr / pdf_parsing
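PDF parsing (text, sections, tables, images, references); PDF question answering, summarization, and information extraction based on large models (ChatGLM2-6B, RWKV) + langchain + streamlit
☆211 · Oct 17, 2023 · Updated 2 years ago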
RapidAI / RapidLaTeXOCR
Formula recognition based on LaTeX-OCR and ONNXRuntime.
☆388 · Nov 3, 2024 · Updated last year
ayjin-dev / cnkispider
CNKI (China National Knowledge Infrastructure), papers, paper download, abstract lookup
☆16 · Mar 31, 2020 · Updated 6 years ago
leduclinh7141 / BetterWhisperX
☆22 · Nov 15, 2024 · Updated last year
RapidAI / PaddleOCRModelConvert
Converts PaddleOCR models to ONNX format
☆120 · Jul 15, 2025 · Updated last year
Naoghuman / lib-fxml
The `Lib-FXML` library simplifies the loading of JavaFX-related files (model, view, controller, .fxml, .css, .properties) and enables …
☆21 · Oct 13, 2020 · Updated 5 years ago
Tanmoy001 / OCROfBankStatement
OCR of Bank Statement - Infosys Springboard 5.0. In this project I developed a fully functional website for OCR of bank or financial do…
☆12 · Jan 4, 2025 · Updated last year
liu-qingyuan / faster_whisper_gradio
Real time faster whisper gradio
☆24 · Aug 17, 2025 · Updated 11 months ago
jiangnanboy / layout_analysis
Chinese layout detection: YOLOv8 is used to detect the layout of Chinese document images.
☆60 · Apr 28, 2023 · Updated 3 years ago
jiangnanboy / doc_ai
Converts OCR and other models from Paddle to ONNX format, then experiments with loading these ONNX models for inference using DJL, the Java deep learning framework.
☆14 · Nov 15, 2022 · Updated 3 years ago
OverflowCat / paddleocr
A simple wrapper for hiroi-sora/PaddleOCR-json.
☆17 · Oct 20, 2023 · Updated 2 years ago
jiangnanboy / llm_corpus_quality
Cleaning and quality assessment of Chinese corpora for large-model pre-training
☆80 · Jul 25, 2024 · Updated last year
Gmgge / ImageAnalysisService
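Image analysis web service built on lightweight models, including skew-corrected OCR, official seal (stamp) detection and recognition, and license plate recognition. The API uses FastAPI + Gunicorn, with a Gradio demo.
☆103 · Apr 30, 2024 · Updated 2 years ago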
ArminKmz / im2latex
PyTorch implementation of converting math equation images to LaTeX markup.
☆30 · Oct 25, 2020 · Updated 5 years ago
hiroi-sora / PaddleOCR-json
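Offline OCR command-line program for Windows that recognizes text in images and outputs results as JSON strings for easy use by other programs. Provides APIs for various languages. Compiled from PaddleOCR C++.
☆1,528 · Apr 7, 2025 · Updated last year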
LlamaEdge / chatbot-ui
☆25 · Apr 29, 2025 · Updated last year
ElvisClaros / GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
☆23 · Sep 26, 2024 · Updated last year
chu-tianxiang / vllm-gptq
A high-throughput and memory-efficient inference and serving engine for LLMs
☆131 · Jun 25, 2024 · Updated 2 years ago
alisen39 / TrWebOCR
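Open-source, easy-to-use offline Chinese OCR with recognition accuracy comparable to major vendors. Provides an easy-to-use web page and web API, convenient for everyday human use or for calls from other programs.
☆2,879 · Jun 14, 2023 · Updated 3 years ago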
RapidAI / RapidRAG
QA based on local knowledge and LLM.
☆250 · Jan 16, 2026 · Updated 6 months ago
opendatalab / magic-doc
☆549 · Jul 26, 2024 · Updated last year
RUCKBReasoning / TableLLM
TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios
☆252 · Aug 20, 2025 · Updated 11 months ago
breezedeus / CnSTD
CnSTD: a Python 3 package based on PyTorch/MXNet for Chinese/English Scene Text Detection, Mathematical Formula Detection (MFD), and Layout Analysis
☆792 · Jul 5, 2026 · Updated 2 weeks ago
RapidAI / RapidTTS
Lightweight text-to-speech tool for fast local inference. A text-to-speech framework for fast and high-quality speech synthesis.
☆55 · May 29, 2026 · Updated last month
shaotin / dm
State machine for a multi-turn task-oriented dialogue manager
☆22 · Oct 13, 2020 · Updated 5 years ago
cvlab-stonybrook / PaperEdge
The code and the DIW dataset for "Learning From Documents in the Wild to Improve Document Unwarping" (SIGGRAPH 2022)
☆137 · Jul 28, 2024 · Updated last year
infinigence / InfiniWebSearch
A demo built on Megrez-3B-Instruct, integrating a web search tool to enhance the model's question-and-answer capabilities.
☆39 · Dec 15, 2024 · Updated last year
hiroi-sora / GapTree_Sort_Algorithm
[Gap Tree Sort Algorithm] Performs layout analysis on OCR results or text extracted from PDFs and sorts it in human reading order.
☆167 · Feb 28, 2024 · Updated 2 years ago
binghe001 / mykit-concurrent-jdk
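🔥🔥🔥 Companion source code for the book 《深入理解高并发编程：JDK核心技术》 (Understanding High-Concurrency Programming in Depth: JDK Core Technologies)
☆20 · Apr 6, 2023 · Updated 3 years ago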
MetaGLM / FinGLM
FinGLM: dedicated to building an open, non-profit, long-lasting financial large-model project, using open source and openness to advance "AI + finance".
☆2,253 · May 8, 2024 · Updated 2 years ago