dannguyen / abbyy-finereader-ocr-senate
Evaluating the performance and accuracy of ABBYY FineReader's OCR on Senate Financial Disclosure scanned forms
☆131 Updated 9 years ago
Alternatives and similar repositories for abbyy-finereader-ocr-senate
Users who are interested in abbyy-finereader-ocr-senate are comparing it to the libraries listed below.
- Extract tables from PDF files ☆356 Updated 9 years ago
- A collection of tools for mining government data ☆140 Updated 8 years ago
- NICAR 2016 talk about PDFs! ☆62 Updated 9 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 Updated 7 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily ☆113 Updated 9 years ago
- A Python web application for converting PDF forms into PDF-filling APIs ☆46 Updated 4 years ago
- Online natural language processing with word vectors ☆309 Updated 10 months ago
- Extract tabular data and semantically discover it with ease! (OS) ☆21 Updated 9 years ago
- A library for extracting tables from PDF files ☆89 Updated 11 years ago
- Tool for visual exploration of complex data. ☆191 Updated 6 years ago
- Python library to extract text from PDF, and default to OCR when text extraction fails. ☆62 Updated 7 years ago
- ☆89 Updated 9 years ago
- Create simple APIs from CSV files ☆194 Updated 4 years ago
- TensorFlow for AWS ☆116 Updated 9 years ago
- Mechanical Turk on your own machine. ☆206 Updated 6 months ago
- A lightweight server to allow HTTP requests to the Stanford Named Entity Recognizer and a heavily modified CLAVIN geoparser. ☆119 Updated 2 years ago
- Analyzes a CSV file and generates database table schema, all within the browser ☆315 Updated 9 years ago
- Code to transform Hillary's emails from raw PDF documents to a SQLite database ☆161 Updated 9 years ago
- ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (image… ☆95 Updated 6 years ago
- Repository for PyCon 2016 workshop Natural Language Processing in 10 Lines of Code ☆239 Updated 7 years ago
- All stories and comments posted on Hacker News up to May 29, 2014 ☆128 Updated 6 years ago
- Simple Python scripts to download all Hacker News submissions and comments and store them in a PostgreSQL database. ☆123 Updated 7 years ago
- Loan-level analysis of Fannie Mae and Freddie Mac data ☆219 Updated 5 years ago
- A framework for visualizing parent-child relationships with d3js ☆116 Updated 7 years ago
- Download Hillary Clinton's emails and query them with sqlite ☆153 Updated 5 years ago
- We introduce TACIT: An Open-Source Text Analysis, Crawling and Interpretation Tool. TACIT's plugin architecture has three main components… ☆107 Updated 6 years ago
- Supervised learning for novelty detection in text ☆78 Updated 8 years ago
- Python workers that collect tweets from the Twitter streaming API and track deletions ☆128 Updated 2 years ago
- Make it easy to turn a lot of potentially large CSV files into easily accessible open data ☆198 Updated 8 years ago
- A toolbox and web application for working with and presenting textual material from Shakespeare to Schopenhauer, and letters to literatur… ☆149 Updated 10 years ago