alephdata / pdflib
Binary Python bindings for poppler utils for content extraction
☆42 Updated 3 years ago
Alternatives and similar repositories for pdflib:
Users who are interested in pdflib compare it to the libraries listed below.
- Provide partial dates and retain the date precision through processing ☆13 Updated 2 years ago
- Python wrapper for a C++ Double Metaphone ☆15 Updated 2 years ago
- A Datasette plugin providing an MLOps platform to train, eval and predict machine learning models ☆15 Updated this week
- A maximum-strength name parser for record linkage. ☆36 Updated last week
- Generate Pandas frames, load and extract data, based on JSON Table Schema descriptors. ☆52 Updated 3 years ago
- An alpha project combining beneficial ownership and contracting data ☆13 Updated 3 years ago
- A tool to allow US addresses to be geocoded/georeferenced easily, without using Python or the command line or paid services or anything. ☆17 Updated 2 years ago
- Write Datasette canned queries as plain SQL files ☆13 Updated 2 years ago
- Utility library to turn country names into ISO two-letter codes ☆66 Updated this week
- This repo contains the draft, images, and code for the Medium blog post on Altair themes. ☆12 Updated 6 years ago
- Generate SQL tables, load and extract data, based on JSON Table Schema descriptors. ☆62 Updated last year
- Date parsing and normalization utilities for Python. ☆22 Updated last year
- View Vega/Vega-Lite plots in your web browser from local or remote Python processes. ☆35 Updated 3 weeks ago
- Extract networks of entities from journalistic reporting ☆48 Updated last year
- Python parser for the Archie Markup Language (ArchieML) ☆12 Updated 3 years ago
- Tool to dump all GPS traces collected by/for the OpenStreetMap project. ☆25 Updated 5 years ago
- A browser user interface for manual labeling of record pairs. ☆44 Updated last year
- Command line tool to convert spreadsheets to databases, made for the UK's Office for National Statistics. ☆78 Updated last year
- Render a map for any query with a geometry column ☆25 Updated 6 months ago
- Python and pandas tools to perform various analyses on different types of word lists ☆16 Updated 10 years ago
- agate-sql adds SQL read/write support to agate. ☆19 Updated 2 weeks ago
- A simple command line interface to the datamade/dedupe library. ☆42 Updated 2 years ago
- Slideshow template for Voilà based on RevealJS ☆16 Updated 3 years ago
- A repository of materials for a proposed class on automated story bots. ☆49 Updated 6 years ago
- A Python library for defining rule-based overrides on messy data ☆13 Updated 3 months ago
- data wrangling simplicity, complete audit transparency, and at speed ☆34 Updated last week
- Web interface for network analysis. ☆21 Updated 2 years ago
- Commons of stupid, simple Python micro functions. Pull requests very welcome. ☆19 Updated 2 years ago
- Python library and command line tool for converting data from one format to another ☆99 Updated 4 years ago
- Statistical visualizations for Datasette using Seaborn ☆11 Updated 2 years ago