alephdata / pdflib
Binary Python bindings for poppler utils for content extraction
☆42 Updated 4 years ago
Alternatives and similar repositories for pdflib
Users who are interested in pdflib are comparing it to the libraries listed below.
- Utility library to turn country names into ISO two-letter codes ☆70 Updated last month
- Python wrapper for a C++ Double Metaphone ☆15 Updated this week
- A maximum-strength name parser for record linkage. ☆37 Updated 3 weeks ago
- agate-sql adds SQL read/write support to agate. ☆18 Updated 4 months ago
- Provide partial dates and retain the date precision through processing ☆13 Updated 2 years ago
- Generate Pandas frames, load and extract data, based on JSON Table Schema descriptors. ☆52 Updated 4 years ago
- International Address formatter which considers the standard formatting rules of the country ☆26 Updated 3 years ago
- A search engine for Open Data ☆53 Updated 2 years ago
- Scalable String Similarity Joins in Python ☆39 Updated 11 months ago
- Generate SQL tables, load and extract data, based on JSON Table Schema descriptors. ☆62 Updated last year
- A tool to allow US addresses to be geocoded/georeferenced easily, without using Python or the command line or paid services or anything. ☆18 Updated 2 years ago
- Transform flat data structures into nested object graphs matching JSON schema definitions. ☆28 Updated 8 years ago
- Inspect a URL and estimate if it contains a news story ☆39 Updated 7 months ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- Tools for generating CSV and other flat versions of the structured data ☆107 Updated 2 months ago
- An alpha project combining beneficial ownership and contracting data ☆13 Updated 4 years ago
- Extract networks of entities from journalistic reporting ☆48 Updated last year
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models ☆16 Updated 3 weeks ago
- Python library for reading and writing tabular data via streams. ☆238 Updated 4 years ago
- Python library and command line tool for converting data from one format to another ☆99 Updated 5 years ago
- Python library with common functionality for writing web scrapers ☆102 Updated 10 years ago
- A browser user interface for manual labeling of record pairs. ☆47 Updated 2 years ago
- A library for extracting tables from PDF files ☆90 Updated 4 years ago
- Auto-generate Python APIs from JSON schema specifications ☆79 Updated 5 years ago
- A Python module that will check for package updates. ☆28 Updated 3 years ago
- This repo contains the draft, images, and code for the Medium blog post on Altair themes. ☆12 Updated 6 years ago
- A web application that identifies party in political discourse and an example of operationalized machine learning. ☆28 Updated 6 years ago
- Implement SQLite table-valued functions with Python ☆59 Updated last year
- Datasette plugin for visualizing data using Vega ☆59 Updated last year
- Python language parser for a tabular format for structured metadata. http://metatab.org ☆18 Updated last year