jamesturk / spatula
A modern Python library for writing maintainable web scrapers.
☆250 Updated 10 months ago
Alternatives and similar repositories for spatula
Users who are interested in spatula are comparing it to the libraries listed below.
- ⛏ a library for scraping unreliable pages ☆211 Updated 8 months ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆151 Updated 4 months ago
- ☆14 Updated last year
- Python library and CLI you can use to move relational data from one place to another - DBs/CSV/gsheets/dataframes/... ☆37 Updated 10 months ago
- Find your broken links, so users don't. ☆67 Updated last week
- Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more. ☆110 Updated 6 months ago
- A Python module for accessing the Open States API ☆29 Updated last year
- ProPublica's collaborative tip-gathering framework. Import and manage CSV, Google Sheets and Screendoor data with ease. ☆100 Updated 2 years ago
- The data journalism platform with built in training ☆305 Updated 5 months ago
- Now included in rigour ☆151 Updated last week
- Platform for journalists to search, analyse, categorise and share unstructured data ☆55 Updated 3 weeks ago
- Utility library to turn country names into ISO two-letter codes ☆66 Updated 2 months ago
- Extract city and country mentions from text, like GeoText, but using FlashText (an Aho-Corasick implementation) instead of regex. ☆62 Updated this week
- General programming utilities from Pew Research Center ☆69 Updated 3 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 Updated 7 years ago
- Add website scraping abilities to Datasette ☆62 Updated 2 years ago
- Parse government documents into well-formed JSON ☆68 Updated last week
- Datasette of earnings call transcripts from the Motley Fool ☆15 Updated 2 years ago
- A Python implementation of Lunr.js 🌖 ☆195 Updated 2 months ago
- Save an RSS or ATOM feed to a SQLite database ☆51 Updated 2 years ago
- Datasette plugin providing data dashboards from metadata ☆146 Updated last month
- Core library for the datakit CLI framework. ☆55 Updated 2 years ago
- Opinionated cookiecutter template for creating a new Python library ☆196 Updated last month
- Datasette plugin that shows a map for any data with latitude/longitude columns ☆93 Updated 8 months ago
- An open-source archive that gathers, saves, shares and analyzes news homepages ☆139 Updated 4 months ago
- Clean up all those Pythons crawling around your computer ☆15 Updated 2 years ago
- Group thousands of similar spreadsheet or database text entries in seconds ☆155 Updated last year
- Opinionated template for Django projects on Python 3 and PostgreSQL ☆24 Updated 7 years ago
- A Python parser for the .fec file format ☆45 Updated last week
- An extremely fast FEC filing parser written in C ☆76 Updated 2 weeks ago