PublicI / pdf-gcv-ocr
Tool to OCR PDFs using Google Cloud Vision
☆42Updated 2 years ago
Alternatives and similar repositories for pdf-gcv-ocr
Users interested in pdf-gcv-ocr are comparing it to the libraries listed below.
- Ergonomic line-by-line transcription of scanned text.☆53Updated 4 years ago
- A database of court reporters, tests and other experiments☆114Updated 3 weeks ago
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf.☆107Updated 4 years ago
- an extensible tool to generate hyperlinks from legal citations☆36Updated 11 months ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)☆196Updated 4 months ago
- A database of courts, tests and other experiments☆92Updated last week
- A collection of regular expressions for matching citations to state, federal, and even international law☆39Updated 4 years ago
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.☆396Updated last year
- Conversions between various OCR formats☆80Updated 2 years ago
- Tools to process books in a cloud based pipeline system☆62Updated 5 months ago
- The Syriac New Testament in Text-Fabric☆13Updated last year
- Find legal citations in any block of text☆174Updated 2 months ago
- A Twitter data collection and appraisal application.☆51Updated 2 years ago
- Extract case law citations with Node☆59Updated 11 years ago
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip!☆299Updated 4 months ago
- Convert a PDF via OCR to a TXT file in UTF-8 encoding☆152Updated last year
- guides and test data for OCR4all☆32Updated 2 years ago
- CollectionBuilder-CSV is a "stand alone" template for creating digital collection and exhibit websites using Jekyll and a metadata CSV.☆33Updated 2 weeks ago
- Comparing warc files☆17Updated 6 years ago
- Web based JavaScript GUI library for proofreading/editing hOCR☆96Updated 7 years ago
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects.☆39Updated last year
- Efficient hOCR tooling☆46Updated last month
- A financial disclosure data extraction tool.☆18Updated 2 years ago
- Automatic alignment of books between HathiTrust, Internet Archive, Google Books, etc.☆36Updated last week
- The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.☆147Updated last year
- The sequel to Big Cases Bot☆25Updated 3 weeks ago
- Abbreviations for use with the Abbreviation Filter developed for use with Multilingual Zotero.☆18Updated last year
- Social Feed Manager user interface application.☆156Updated last year
- Legal citation extractor, via command line, JavaScript, or HTTP. See a live example at:☆244Updated 5 years ago
- Dockerized development environment for Omeka S☆10Updated last week