PublicI / pdf-gcv-ocr
Tool to OCR PDFs using Google Cloud Vision
☆39 Updated 2 years ago
Alternatives and similar repositories for pdf-gcv-ocr:
Users who are interested in pdf-gcv-ocr are comparing it to the libraries listed below
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf. ☆105 Updated 4 years ago
- Make a searchable pdf via Google Cloud Vision OCR ☆14 Updated 5 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆186 Updated 2 weeks ago
- Ergonomic line-by-line transcription of scanned text. ☆51 Updated 4 years ago
- A Free Database of Legal Materials ☆24 Updated 5 years ago
- Structured data for classical studies ☆18 Updated 8 years ago
- A database of court reporters, tests and other experiments ☆101 Updated this week
- OCRmyPDF EasyOCR plugin ☆62 Updated 5 months ago
- Master repository which includes most other OCR-D repositories as submodules ☆72 Updated last week
- Efficient hOCR tooling ☆42 Updated this week
- guides and test data for OCR4all ☆30 Updated 2 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆382 Updated 6 months ago
- METS/ALTO OCR enhancing tool by the National Library of Luxembourg (BnL) ☆54 Updated last year
- A collection of regular expressions for matching citations to state, federal, and even international law ☆33 Updated 3 years ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆65 Updated last year
- HOCR Specification Python Parser ☆13 Updated 9 years ago
- Reading legal authority for the last time ☆34 Updated this week
- A microservice for document conversion at scale ☆62 Updated 2 weeks ago
- A database of courts, tests and other experiments ☆67 Updated last week
- Fast PDF generation and compression. Deals with millions of pages daily. ☆107 Updated 6 months ago
- An online annotation platform for teaching and learning in the humanities. ☆107 Updated last week
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects. ☆36 Updated last year
- Conversions between various OCR formats ☆74 Updated last year
- A browser extension providing Open Access bibliographical services ☆14 Updated 2 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆55 Updated last year
- Named-Entity Recognition extension for OpenRefine ☆26 Updated 2 years ago
- an extensible tool to generate hyperlinks from legal citations ☆33 Updated 4 months ago
- A PDF classifier ensemble with REST API service ☆23 Updated 3 years ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 Updated 6 years ago
- Process, enhance and evaluate multiple OCR output. ☆22 Updated 3 months ago