18F / doc_processing_toolkit
Python library to extract text from PDF, and default to OCR when text extraction fails.
☆61Updated 7 years ago
Alternatives and similar repositories for doc_processing_toolkit:
Users who are interested in doc_processing_toolkit are comparing it to the libraries listed below
- A basic spreadsheet to api engine☆42Updated 5 years ago
- Please check out https://github.com/18F/foia-hub/issues to track our work. This repo is for project wide discussion, blogging, and scratc…☆51Updated 6 years ago
- This is a list of various datasets that are collected by States initially and then provided to federal agencies.☆20Updated 3 years ago
- Friendly Slack bot for looking up cases☆21Updated 7 years ago
- framework for scraping legislative/government data☆85Updated 5 months ago
- Legal codes, for humans.☆254Updated 3 years ago
- Turns legal citations in the DOM into links☆20Updated 7 years ago
- Collecting reports from Inspectors General across the US federal government.☆109Updated 4 years ago
- ReVAL: Reusable Validation Library - A Django App for validating data via API and web interface☆32Updated 3 years ago
- “Let Me Get That Data For You” catalogs the machine-readable data on a given domain name. [RETIRED]☆102Updated 9 years ago
- A complete agency API program.☆12Updated 7 years ago
- Unified Python bindings for Sunlight APIs☆66Updated 8 years ago
- A project focused on tools and best practices to support federated data collection efforts☆28Updated 4 years ago
- We use Tock to track and report our time at 18F☆121Updated 3 weeks ago
- Importer for US Spending data☆33Updated 10 years ago
- Scrapers for US municipal governments.☆100Updated 8 months ago
- legacy backend for Open States☆87Updated 5 years ago
- ☆11Updated 9 years ago
- Coding space for the LegisLetters project.☆12Updated 9 years ago
- DEPRECATED See https://github.com/18F/fec-cms for fec.gov's code☆43Updated 7 years ago
- Slides for 18F - built automatically using Federalist☆30Updated 7 years ago
- Scraping, parsing and indexing the daily Congressional Record to support phrase search over time, and by legislator and date☆122Updated 2 years ago
- A Python web application for converting PDF forms into PDF-filling APIs☆46Updated 4 years ago
- US Digital Registry.☆79Updated 2 years ago
- Parser and standardizer for politician, individual and organization names.☆129Updated 7 years ago
- A deprecated Python wrapper for the DocumentCloud API☆63Updated 4 years ago
- a web app for keeping tabs on city council activity in New York City☆39Updated 5 years ago
- A consolidated FOIA request hub.☆49Updated 6 years ago
- Suggestions, schedules, and other information about the Engineering Chapter's Tech Talk meetings.☆28Updated last year
- Parser for U.S. federal regulations and other regulatory information☆39Updated last year