jamesturk / scrapelib
⛏ a library for scraping unreliable pages
☆210 Updated 6 months ago
Alternatives and similar repositories for scrapelib:
Users who are interested in scrapelib are comparing it to the libraries listed below.
- legacy backend for Open States ☆87 Updated 5 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 Updated 7 years ago
- Unified Python bindings for Sunlight APIs ☆66 Updated 8 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆147 Updated last month
- Python library with common functionality for writing web scrapers ☆102 Updated 9 years ago
- A modern Python library for writing maintainable web scrapers. ☆245 Updated 7 months ago
- A Python module for accessing the Open States API ☆29 Updated last year
- PyOpenGraph is a library written in Python for parsing Open Graph protocol information from web sites. ☆94 Updated 10 years ago
- framework for scraping legislative/government data ☆85 Updated 5 months ago
- Publish spreadsheets as interactive tables. And do it on deadline. ☆74 Updated 8 years ago
- A deprecated Python wrapper for the DocumentCloud API ☆63 Updated 4 years ago
- Opinionated template for Django projects on Python 3 and PostgreSQL ☆24 Updated 7 years ago
- A Flask-based static site authoring tool. ☆165 Updated 2 years ago
- Utility library to turn country names into ISO two-letter codes ☆66 Updated this week
- Django template filter that creates an "a" or "an" in front of your text based on its phonetic value. ☆25 Updated 9 years ago
- Python library to extract text from PDF, and default to OCR when text extraction fails. ☆61 Updated 7 years ago
- Next-gen web application for public finance data warehouses, formerly OpenSpending ☆57 Updated 2 years ago
- ☆121 Updated 11 years ago
- python library to the bitly api ☆245 Updated 3 years ago
- ScraperWiki Python library for scraping and saving data ☆159 Updated 2 years ago
- A repository of journalist's lookup tables. ☆106 Updated 7 years ago
- Open remote tables, be they CSV, XLSX, HTML, XML, ... ☆35 Updated 13 years ago
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation. ☆148 Updated 3 weeks ago
- Python lib for Embedly ☆81 Updated 11 months ago
- PANDA: A Newsroom Data Appliance ☆206 Updated 2 years ago
- agate-sql adds SQL read/write support to agate. ☆19 Updated 2 weeks ago
- Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more. ☆107 Updated 3 months ago
- pyaddress is an address parsing library, taking the guesswork out of using addresses in your applications. We use it as part of our apart… ☆100 Updated 5 years ago
- Library for guessing a person's gender by their first name. ☆57 Updated 7 years ago
- Code for the "Intro to Data Journalism with Python" Workshop ☆74 Updated 11 years ago