jamesturk / spatula
A modern Python library for writing maintainable web scrapers.
☆245 Updated 7 months ago
Alternatives and similar repositories for spatula:
Users interested in spatula are comparing it to the libraries listed below:
- ⛏ a library for scraping unreliable pages ☆210 Updated 5 months ago
- A Python module for accessing the Open States API ☆29 Updated last year
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆147 Updated last month
- Python library and CLI you can use to move relational data from one place to another - DBs/CSV/gsheets/dataframes/... ☆37 Updated 7 months ago
- A general purpose tool for text-based crosswalking ☆103 Updated 10 months ago
- General programming utilities from Pew Research Center ☆69 Updated 3 years ago
- Django app for building dashboards using raw SQL queries ☆447 Updated 11 months ago
- Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more. ☆107 Updated 3 months ago
- Easily download U.S. census maps ☆33 Updated last year
- An extremely fast FEC filing parser written in C ☆74 Updated 5 months ago
- Find your broken links, so users don't. ☆66 Updated 2 weeks ago
- A clever brute-force correlator for kinda-messy data ☆82 Updated last year
- Add website scraping abilities to Datasette ☆62 Updated last year
- ProPublica's collaborative tip-gathering framework. Import and manage CSV, Google Sheets and Screendoor data with ease. ☆99 Updated 2 years ago
- A Python implementation of Lunr.js 🌖 ☆196 Updated last month
- A maximum-strength name parser for record linkage. ☆36 Updated last week
- A new python implementation of an old classic ☆14 Updated 2 years ago
- a python parser for the .fec file format ☆45 Updated last year
- Group thousands of similar spreadsheet or database text entries in seconds ☆156 Updated last year
- Opinionated template for Django projects on Python 3 and PostgreSQL ☆24 Updated 7 years ago
- Provide partial dates and retain the date precision through processing ☆13 Updated 2 years ago
- A git scraper recording the CDC's Covid Data Tracker numbers on number of vaccinations per state. ☆24 Updated last year
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 Updated last year
- The data journalism platform with built in training ☆304 Updated 2 months ago
- Text and statistics utilities from Pew Research Center ☆83 Updated 3 years ago
- Python package for easy access to EveryPolitician data ☆36 Updated 8 years ago
- Scripts to make specific datasets cleaner and more convenient ☆40 Updated 2 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 5 years ago
- Extract networks of entities from journalistic reporting ☆48 Updated last year
- Datasette of earning call transcripts from the Motley Fool ☆15 Updated last year