PaddlePaddle / PaddleOCR
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
☆65,653Updated this week
Alternatives and similar repositories for PaddleOCR
Users who are interested in PaddleOCR are comparing it to the libraries listed below.
- Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and …☆28,463Updated last year
- 📄 Awesome OCR multiple programming languages toolkits based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.☆5,353Updated this week
- Ultra-lightweight Chinese OCR with vertical text recognition; supports ncnn, mnn and tnn inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); total model size only 4.7M☆12,240Updated 2 years ago
- yolo3+ocr☆6,110Updated 3 years ago
- Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.☆49,845Updated this week
- Easy-to-use and powerful LLM and SLM library with awesome model zoo.☆12,862Updated last week
- CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scen…☆3,694Updated 2 months ago
- OpenMMLab Text Detection, Recognition and Understanding Toolbox☆4,693Updated last year
- Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model☆8,026Updated 9 months ago
- OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched☆31,925Updated this week
- PaddleFormers is an easy-to-use library of pre-trained large language model zoo based on PaddlePaddle.☆12,942Updated this week
- All-in-One Development Tool based on PaddlePaddle☆5,923Updated this week
- 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal model…☆153,203Updated this week
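- PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 飞桨 (PaddlePaddle): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)☆23,452Updated this week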
- Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-ti…☆13,951Updated last month
- FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data process…☆26,379Updated last week
- Tesseract Open Source OCR Engine (main repository)☆71,151Updated last month
- A treasure chest for visual classification and recognition powered by PaddlePaddle☆5,762Updated last month
- OCR, layout analysis, reading order, table recognition in 90+ languages☆18,942Updated last month
- Toolkit for linearizing PDFs for LLM datasets/training☆16,115Updated this week
- PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, Wav2Lip, picture repair, image ed…☆8,062Updated last year
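- OCR software, free, open-source and offline. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Built-in multilingual libraries.☆40,416Updated 2 weeks ago
- Chinese and English sensitive words, language detection, Chinese and foreign mobile/landline number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, traditional/simplified Chinese conversion, English imitation of Chinese pronunciation, Wang Feng lyrics generator, occupation name lexicon, synonym lexicon, antonym lexicon, negation word lexicon, 汽…☆77,570Updated last year
- Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional conversion, natural language processing☆35,927Updated 2 weeks ago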
- An open-source online reverse dictionary.☆7,098Updated 3 years ago
- Easy-to-use image segmentation library with awesome pre-trained model zoo, supporting wide-range of practical tasks in Semantic Segmentat…☆9,227Updated 2 weeks ago
- 🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time☆36,789Updated 3 weeks ago
- A modular graph-based Retrieval-Augmented Generation (RAG) system☆29,543Updated this week
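- Open-source, easy-to-use offline Chinese OCR with recognition accuracy comparable to that of major vendors; provides an easy-to-use web page and web API, convenient for everyday use by people or for calls from other programs☆2,841Updated 2 years ago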
- A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations☆16,129Updated this week