jamesturk / scrapelib
⛏ a library for scraping unreliable pages
☆211 Updated last week
Alternatives and similar repositories for scrapelib
Users who are interested in scrapelib are comparing it to the libraries listed below.
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆152 Updated 5 months ago
- A modern Python library for writing maintainable web scrapers. ☆249 Updated last week
- Python library with common functionality for writing web scrapers ☆102 Updated 9 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 Updated 8 years ago
- legacy backend for Open States ☆87 Updated 5 years ago
- Unified Python bindings for Sunlight APIs ☆66 Updated 9 years ago
- Open remote tables, be they CSV, XLSX, HTML, XML, ... ☆33 Updated 13 years ago
- framework for scraping legislative/government data ☆85 Updated 9 months ago
- Opinionated template for Django projects on Python 3 and PostgreSQL ☆24 Updated 7 years ago
- Utility library to turn country names into ISO two-letter codes ☆69 Updated last week
- Utilities for working with data. ☆20 Updated 10 years ago
- ScraperWiki Python library for scraping and saving data ☆159 Updated 2 years ago
- A deprecated Python wrapper for the DocumentCloud API ☆62 Updated 4 years ago
- A Python module for accessing the Open States API ☆29 Updated last year
- Archived Project - Please reference 3rd party forks listed in README ☆54 Updated 5 years ago
- PyOpenGraph is a library written in Python for parsing Open Graph protocol information from web sites. ☆94 Updated 11 years ago
- A Python library for finding feed links on websites. ☆52 Updated 3 years ago
- PANDA: A Newsroom Data Appliance ☆205 Updated 2 years ago
- Publish spreadsheets as interactive tables. And do it on deadline. ☆74 Updated 8 years ago
- A Flask-based static site authoring tool. ☆165 Updated 3 years ago
- Easy extraction of keywords and engines from search engine results pages (SERPs). ☆90 Updated 3 years ago
- Ultra simple API for geocoding a single string against various web services. ☆183 Updated 11 years ago
- A Python client for accessing the Google Analytics API ☆248 Updated 3 years ago
- Simple type converters: make ints, floats, bools and dates from your strings! ☆11 Updated 8 years ago
- Tutorial on transforming a complex, linear Python script into a modular, easier-to-maintain application. ☆44 Updated 6 years ago
- ProPublica's collaborative tip-gathering framework. Import and manage CSV, Google Sheets and Screendoor data with ease. ☆100 Updated 2 years ago
- Some simple math we use to do journalism. ☆79 Updated 8 years ago
- ArchiveKit manages data and documents during ETL processes, either on a local file system or on S3. ☆15 Updated 10 years ago
- A tool to allow US addresses to be geocoded/georeferenced easily, without using Python or the command line or paid services or anything. ☆18 Updated 2 years ago
- A repository of journalist's lookup tables. ☆106 Updated 8 years ago