tesseract-ocr / tessdata
Trained models with fast variant of the "best" LSTM models + legacy models
☆7,003 Updated last year
Alternatives and similar repositories for tessdata
Users who are interested in tessdata are comparing it to the repositories listed below.
- Best (most accurate) trained LSTM models. ☆1,372 Updated last year
- Source training data for Tesseract for lots of languages ☆857 Updated 2 months ago
- Tesseract Open Source OCR Engine (main repository) ☆67,793 Updated 3 weeks ago
- Fast integer versions of trained LSTM models ☆549 Updated 10 months ago
- Leptonica is an open source library containing software that is broadly useful for image processing and image analysis applications. The … ☆1,925 Updated last month
- Tesseract Open Source OCR Engine (main repository) ☆3,593 Updated this week
- Tesseract documentation ☆2,086 Updated 3 weeks ago
- Train Tesseract LSTM with make ☆683 Updated 2 months ago
- Line based ATR Engine based on OCRopy ☆1,148 Updated last month
- Links to awesome OCR projects ☆2,989 Updated 11 months ago
- Java OCR recognition component (based on the Tesseract OCR engine). Handles image cleanup and recognition of CAPTCHA image content in one integrated workflow. Java Image cleanup, OCR recognition component (based Tesseract OCR e… ☆619 Updated 3 years ago
- 📄 Awesome OCR multiple programming languages toolkits based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch. ☆4,340 Updated this week
- A framework like Celery! ☆2 Updated 2 years ago
- OCR engine for all the languages ☆841 Updated last week
- Various documents related to Tesseract OCR ☆266 Updated 3 years ago
- Awesome multilingual OCR and Document Parsing toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languag… ☆50,836 Updated this week
- CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scen… ☆3,578 Updated 6 months ago
- Open source Python library for converting PDF to DOCX. ☆2,996 Updated last month
- A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model. ☆1,457 Updated 10 months ago
- Convert PDF to HTML without losing text or format. ☆10,498 Updated 2 years ago
- Simple PDF text extraction ☆941 Updated 4 months ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆394 Updated 10 months ago
- yolo3+ocr ☆6,076 Updated 2 years ago
- finetuned traineddata files for tesseract 4.0.0 for testing ☆165 Updated 6 years ago
- Box editor and trainer for Tesseract OCR ☆242 Updated this week
- Ultra-lightweight Chinese OCR; supports vertical text recognition and ncnn, mnn, and tnn inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); total model size is only 4.7M ☆12,154 Updated last year
- ABBYY Cloud OCR SDK ☆517 Updated 2 years ago
- TableBank: A Benchmark Dataset for Table Detection and Recognition ☆1,062 Updated 10 months ago
- PPOCRLabelv2 is a semi-automatic graphic annotation tool suitable for OCR field, with built-in PP-OCR model to automatically detect and r… ☆232 Updated last week
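- Open-source, easy-to-use offline Chinese OCR with recognition accuracy comparable to that of major vendors; it also provides an easy-to-use web page and web API, convenient for people's everyday work or for other programs to call. ☆2,753 Updated 2 years ago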