jamesturk / scrapelib
⛏ a library for scraping unreliable pages
☆211 Updated this week
Alternatives and similar repositories for scrapelib
Users who are interested in scrapelib are comparing it to the libraries listed below.
- legacy backend for Open States ☆87 Updated 5 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆157 Updated 3 months ago
- Python library with common functionality for writing web scrapers ☆102 Updated 10 years ago
- A modern Python library for writing maintainable web scrapers. ☆248 Updated last month
- framework for scraping legislative/government data ☆89 Updated last month
- Parser and standardizer for politician, individual and organization names. ☆129 Updated 8 years ago
- Easy extraction of keywords and engines from search engine results pages (SERPs). ☆93 Updated 2 months ago
- Opinionated template for Django projects on Python 3 and PostgreSQL ☆24 Updated 8 years ago
- Tools for parsing messy tabular data. This is now superseded by https://github.com/frictionlessdata/tabulator-py ☆392 Updated 2 years ago
- Now included in rigour ☆152 Updated last month
- Scrapes sites. Gets news. Eventually events. ☆85 Updated 9 years ago
- python library for extracting html microdata ☆166 Updated 2 years ago
- Next-gen web application for public finance data warehouses, formerly OpenSpending ☆57 Updated 3 years ago
- Tools for generating CSV and other flat versions of the structured data ☆109 Updated last week
- remove signature blocks from emails ☆86 Updated 6 years ago
- Ultra simple API for geocoding a single string against various web services. ☆183 Updated 12 years ago
- A Python library for extracting titles, images, descriptions and canonical urls from HTML. ☆151 Updated 5 years ago
- geonamescache - a Python library for quick access to a subset of GeoNames data. ☆120 Updated 3 months ago
- A Python module for accessing the Open States API ☆31 Updated 2 years ago
- Utility library to turn country names into ISO two-letter codes ☆71 Updated 4 months ago
- Parse, normalize and render postal addresses. ☆184 Updated 2 years ago
- PANDA: A Newsroom Data Appliance ☆207 Updated 3 years ago
- Unified Python bindings for Sunlight APIs ☆66 Updated 9 years ago
- Scrapy middleware which allows to crawl only new content ☆79 Updated last week
- A toolkit for mapping networks of political and economic influence through diverse types of entities and their relations. Accessible at h… ☆192 Updated 4 years ago
- A Flask-based static site authoring tool. ☆164 Updated 3 years ago
- Simple type converters: make ints, floats, bools and dates from your strings! ☆11 Updated 9 years ago
- ScraperWiki Python library for scraping and saving data; in maintenance mode ☆158 Updated last week
- Street address parser and formatter ☆91 Updated 6 years ago
- A tool to allow US addresses to be geocoded/georeferenced easily, without using Python or the command line or paid services or anything. ☆18 Updated 3 years ago