PublicI / pdf-gcv-ocr
Tool to OCR PDFs using Google Cloud Vision
☆42 · Updated 2 years ago
Alternatives and similar repositories for pdf-gcv-ocr
Users who are interested in pdf-gcv-ocr are comparing it to the libraries listed below.
- A database of court reporters, tests and other experiments ☆118 · Updated last week
- gcv2hocr converts from Google Cloud Vision OCR output to hocr to make a searchable pdf. ☆106 · Updated 5 years ago
- Ergonomic line-by-line transcription of scanned text. ☆54 · Updated 4 years ago
- A database of courts, tests and other experiments ☆95 · Updated last month
- Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML. ☆402 · Updated last year
- Find legal citations in any block of text ☆178 · Updated last month
- Jurisdiction ID and abbreviation data files for using with Jurism and other projects. ☆38 · Updated 2 years ago
- an extensible tool to generate hyperlinks from legal citations ☆38 · Updated last year
- Abbreviations for use with the Abbreviation Filter developed for use with Multilingual Zotero. ☆18 · Updated 2 years ago
- Reading legal authority for the last time ☆40 · Updated 8 months ago
- A collection of regular expressions for matching citations to state, federal, and even international law ☆40 · Updated 4 years ago
- Parser for Gmail exported emails in MBOX format. ☆31 · Updated 6 years ago
- Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader) ☆196 · Updated 5 months ago
- Social Feed Manager user interface application. ☆156 · Updated last year
- Make a searchable pdf via Google Cloud Vision OCR ☆14 · Updated 5 years ago
- A free tool to OCR a PDF and add a text "layer" in the original file, making a searchable PDF. Use only open source tools. Please tip! ☆298 · Updated 5 months ago
- A collection of tools for archiving and analysing the internet. ☆78 · Updated 3 years ago
- A simple audio file transcriber that uses the Google Cloud Speech API for transcription. ☆26 · Updated 6 years ago
- Easily display Zotero items on a webpage ☆32 · Updated 2 years ago
- Create local backups of airtable databases ☆36 · Updated 2 years ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler. ☆51 · Updated last week
- 📑 Scripts to repair, verify, OCR, compress, wrangle, crop (etc.) PDFs ☆70 · Updated last year
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆152 · Updated 2 years ago
- 📚 A compilation of research relevant to Data Together's efforts tackling the general problem of data resilience & interactivity ☆97 · Updated 7 years ago
- Quickly go from a paper court form to a runnable, guided, step-by-step web application powered by Docassemble. Swap out branding and pre-… ☆54 · Updated this week
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a… ☆99 · Updated 3 years ago
- Textricator is a tool to extract text from documents and generate structured data. ☆350 · Updated 7 months ago
- pythonic interface to the courtlistener api ☆20 · Updated 7 years ago
- Examples for getting started using https://case.law ☆69 · Updated 3 years ago
- Named-Entity Recognition extension for OpenRefine ☆29 · Updated 2 years ago