REMitchell / tesseract-trainer
Tools used to generate training files for the Tesseract OCR project
☆89 · Updated last year
Alternatives and similar repositories for tesseract-trainer
Users who are interested in tesseract-trainer are comparing it to the libraries listed below.
- Python wrapper for the tesseract OCR engine. The module is based on OpenCV ☆177 · Updated 7 years ago
- Scrapy extension to control spiders using JSON-RPC ☆300 · Updated 5 years ago
- CSS Selectors for Python ☆298 · Updated last month
- Useful test spiders for Scrapy ☆185 · Updated 5 years ago
- Scrapy Middleware to set a random User-Agent for every Request. ☆202 · Updated 5 years ago
- Scrapy spider middleware to ignore requests to pages containing items seen in previous crawls ☆273 · Updated 4 months ago
- PhantomJS Downloader for Scrapy, Yeah! ☆94 · Updated 10 years ago
- An elementary captcha decoder written in python ☆155 · Updated 9 years ago
- ☆129 · Updated 6 years ago
- Python library of web-related functions ☆405 · Updated last month
- Minimal realtime chat application (Tutorial) ☆132 · Updated 13 years ago
- Sometimes sites make crawling hard. Selenium-crawler uses selenium automation to fix that. ☆125 · Updated 12 years ago
- Python module for JSON data encoding, including jsonlint. See the project Wiki here on Github. Also read the README at the bottom of th… ☆303 · Updated 5 years ago
- Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy ☆364 · Updated 3 months ago
- The Easiest Way to Present Online ☆45 · Updated 7 years ago
- Use pyppeteer from a Scrapy spider ☆59 · Updated 5 years ago
- This repository stores some examples for learning Scrapy better ☆176 · Updated 4 years ago
- Apr9_NYU ☆36 · Updated 12 years ago
- Scrapy Book 2nd Edition Code http://scrapybook.com/ ☆49 · Updated 3 years ago
- Web Crawling UI and HTTP API, based on Scrapy and Tornado ☆162 · Updated 2 years ago
- Generator of User-Agent header ☆340 · Updated last year
- A dynamically configurable news crawler based on Scrapy ☆165 · Updated 7 years ago
- Utilities for working with Excel files that require both xlrd and xlwt. ☆271 · Updated 6 years ago
- Random User-Agent middleware based on fake-useragent ☆695 · Updated last year
- Redis-based components for Scrapy that allow distributed crawling ☆46 · Updated 10 years ago
- Python API for parsehub.com web scraping service ☆46 · Updated 7 years ago
- Python 3 port of pdfminer ☆186 · Updated 6 years ago
- GtWeb Python Sdk ☆83 · Updated 8 years ago
- Scrapy project based on dirbot to show how to use Twisted's adbapi to store the scraped data in MySQL. ☆117 · Updated 11 years ago
- Output scrapy statistics to graphite/carbon ☆54 · Updated 12 years ago