PublicI / pdf-gcv-ocr
Tool to OCR PDFs using Google Cloud Vision
☆42 · Updated 2 years ago
Alternatives and similar repositories for pdf-gcv-ocr
Users who are interested in pdf-gcv-ocr are comparing it to the libraries listed below.
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf. ☆106 · Updated 5 years ago
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip! ☆299 · Updated 6 months ago
- Ergonomic line-by-line transcription of scanned text. ☆54 · Updated 4 years ago
- A database of court reporters, tests and other experiments ☆117 · Updated this week
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆196 · Updated 6 months ago
- Web based JavaScript GUI library for proofreading/editing hOCR ☆100 · Updated 7 years ago
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆154 · Updated 2 years ago
- Efficient hOCR tooling ☆52 · Updated 3 months ago
- Fast PDF generation and compression. Deals with millions of pages daily. ☆126 · Updated 2 months ago
- an extensible tool to generate hyperlinks from legal citations ☆39 · Updated last year
- Conversions between various OCR formats ☆82 · Updated 2 years ago
- OCRmyPDF EasyOCR plugin ☆93 · Updated 2 months ago
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects. ☆38 · Updated 2 years ago
- Tools to process books in a cloud based pipeline system ☆64 · Updated 7 months ago
- Named-Entity Recognition extension for OpenRefine ☆29 · Updated 3 years ago
- A database of courts, tests and other experiments ☆96 · Updated last month
- Working with hOCR in Javascript ☆136 · Updated 2 years ago
- Reading legal authority for the last time ☆41 · Updated 8 months ago
- A collection of regular expressions for matching citations to state, federal, and even international law ☆40 · Updated 4 years ago
- Comparing warc files ☆17 · Updated 6 years ago
- Social Feed Manager user interface application. ☆156 · Updated last year
- guides and test data for OCR4all ☆32 · Updated 3 years ago
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… ☆131 · Updated last week
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆52 · Updated this week
- Structured data for classical studies ☆19 · Updated 9 years ago
- A collection of tools for archiving and analysing the internet. ☆78 · Updated 3 years ago
- Core development repository. gitHub: Vsn 6 (2020 - ), Vsn 5 (2018 - 2020), Vsn 4 (2014-2017). Sourceforge: Vsn 3 (2009-2013), Vsn 1 & 2 (… ☆65 · Updated last week
- Examples for getting started using https://case.law ☆69 · Updated 3 years ago
- The CIS OCR PostCorrectionTool ☆44 · Updated 3 years ago
- A Twitter data collection and appraisal application. ☆51 · Updated 2 years ago