dannguyen / abbyy-finereader-ocr-senate
Evaluating the performance and accuracy of ABBYY FineReader's OCR on Senate Financial Disclosure scanned forms
☆130 Updated 8 years ago
Alternatives and similar repositories for abbyy-finereader-ocr-senate:
Users who are interested in abbyy-finereader-ocr-senate compare it to the repositories listed below.
- A collection of tools for mining government data ☆140 Updated 8 years ago
- Extract tables from PDF files ☆356 Updated 8 years ago
- Code + Jupyter notebook for analyzing and visualizing Reddit Data quickly and easily ☆112 Updated 9 years ago
- A library for extracting tables from PDF files ☆90 Updated 11 years ago
- NICAR 2016 talk about PDFs! ☆62 Updated 8 years ago
- Extract tabular data and semantically discover it with ease! (OS) ☆21 Updated 8 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 Updated 7 years ago
- Repository for PyCon 2016 workshop Natural Language Processing in 10 Lines of Code ☆239 Updated 7 years ago
- online natural language processing with word vectors ☆309 Updated 7 months ago
- We introduce TACIT: An Open-Source Text Analysis, Crawling and Interpretation Tool. TACIT's plugin architecture has three main components… ☆107 Updated 5 years ago
- ☆89 Updated 9 years ago
- Examples for http://dataviztalk.blogspot.com ☆21 Updated 9 years ago
- Code to transform Hillary's emails from raw PDF documents to a SQLite database ☆161 Updated 9 years ago
- ☆24 Updated 9 years ago
- Backend of Common Search. Analyses webpages and sends them to the index. ☆122 Updated 7 years ago
- A simple viewer and inspection tool for text boxes in PDF documents ☆94 Updated 2 years ago
- using XPDF, pdftojson extracts text from PDF files as JSON, including word bounding boxes. ☆144 Updated last year
- Create simple APIs from CSV files ☆193 Updated 4 years ago
- Tool for visual exploration of complex data. ☆191 Updated 6 years ago
- Using ML to extract campaign finance data from messy forms for journalism ☆76 Updated 2 years ago
- Ocular is a state-of-the-art historical OCR system. ☆258 Updated 8 months ago
- A python server harnessing the calculational ability of LibreOffice Calc (thanks to 'pyoo'). It provides 'instant' access to the cell ran… ☆138 Updated last year
- Extract tables from PDF pages. ☆283 Updated 4 years ago
- A place to collect and share knowledge about liberating data from PDFs ☆54 Updated 3 years ago
- Tools to download and process name data from various sources. ☆90 Updated 11 years ago
- Loan-level analysis of Fannie Mae and Freddie Mac data ☆217 Updated 4 years ago
- Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip. ☆1,272 Updated 4 years ago
- A set of tools to allow PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools… ☆293 Updated 2 years ago
- Pragmatic & Practical Bayesian Sentiment Classifier ☆219 Updated 7 years ago
- Automatically exported from code.google.com/p/lector ☆129 Updated 6 years ago