hay / dataknead
Effortless conversion between data formats like JSON, XML and CSV
☆118 · Updated 2 years ago
Related projects:
- Tools for generating CSV and other flat versions of the structured data ☆103 · Updated this week
- Binary Python bindings for poppler utils for content extraction ☆42 · Updated 3 years ago
- Dump (freeze) SQL query results from a database into a selection of file formats ☆91 · Updated 5 years ago
- Datasette plugin that shows a map for any data with latitude/longitude columns ☆87 · Updated last month
- A simple command line interface to the datamade/dedupe library. ☆42 · Updated last year
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation. ☆142 · Updated 7 months ago
- Python functions for flattening a JSON object to a single dictionary of pairs, and unflattening that dictionary back to a JSON object ☆43 · Updated 2 weeks ago
- Datasette plugin for visualizing data using Vega ☆58 · Updated last year
- Python library for reading and writing tabular data via streams. ☆235 · Updated 3 years ago
- THIS REPOSITORY IS FORK ☆30 · Updated last year
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city. ☆62 · Updated 7 years ago
- Guess gender from first name in Python 2 and 3 ☆129 · Updated 2 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆144 · Updated 8 months ago
- A modern Python library for writing maintainable web scrapers. ☆244 · Updated 2 months ago
- Street address parser and formatter ☆92 · Updated 5 years ago
- Generate SQL tables, load and extract data, based on JSON Table Schema descriptors. ☆60 · Updated last year
- A helper library full of URL-related heuristics. ☆56 · Updated 2 weeks ago
- ⛏ a library for scraping unreliable pages ☆208 · Updated last month
- A pure Python Levenshtein implementation that's not freaking GPL'd. ☆97 · Updated last year
- Generate Pandas frames, load and extract data, based on JSON Table Schema descriptors. ☆52 · Updated 3 years ago
- Partial result caching for pandas in Python. ☆18 · Updated 5 years ago
- Python library and command line tool for converting data from one format to another ☆100 · Updated 4 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 · Updated 5 years ago
- python functions for applied use of schema.org ☆34 · Updated 2 years ago
- Save an RSS or ATOM feed to a SQLite database ☆46 · Updated last year
- A Python library to load structured table data from files/strings/URL with various data format: CSV / Excel / Google-Sheets / HTML / JSON… ☆106 · Updated last year
- python package for performing deduplication using flexible text matching and cleaning in pandas dataframe ☆25 · Updated 3 years ago
- Scrapers for disaster data - writes to https://github.com/simonw/disaster-data ☆49 · Updated 7 months ago
- Framework for processing data packages in pipelines of modular components. ☆118 · Updated last year
- CLI tool for fetching data using HTTP conditional get ☆14 · Updated 3 years ago