public-law / open-gov-crawlers
Parse government documents into well-formed JSON
☆74 · Updated this week
Alternatives and similar repositories for open-gov-crawlers
Users who are interested in open-gov-crawlers are comparing it to the libraries listed below.
- A helper library full of URL-related heuristics. ☆72 · Updated 2 months ago
- Web scraping Page Objects core library ☆103 · Updated 2 weeks ago
- Software stack with latest Scrapy and updated deps ☆65 · Updated 4 months ago
- Now included in rigour ☆152 · Updated 2 weeks ago
- Web grep: search all rendered resources used by a URI ☆89 · Updated 2 weeks ago
- Save data from Google Takeout to a SQLite database ☆117 · Updated 2 years ago
- Extract text from HTML ☆135 · Updated 5 years ago
- Page Object pattern for Scrapy ☆124 · Updated last month
- Python clients for Zyte AutoExtract API ☆41 · Updated 3 years ago
- Zyte Automatic Extraction integration for Scrapy ☆56 · Updated 3 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆115 · Updated this week
- Add website scraping abilities to Datasette ☆66 · Updated 2 years ago
- Spider templates for automatic crawlers. ☆32 · Updated 2 months ago
- Scrapy middleware that allows crawling only new content ☆79 · Updated 3 years ago
- A database of court reporters, tests and other experiments ☆117 · Updated 2 weeks ago
- Scrape various open data directories to create an index of what's available out there ☆37 · Updated 9 months ago
- World legal info: scraped, organized, and permissively licensed under Creative Commons. ☆20 · Updated 6 months ago
- Python-based Wikidata framework for easy dataframe extraction ☆45 · Updated 2 years ago
- Scrapy rotation proxy package with advanced functions ☆95 · Updated 3 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆63 · Updated this week
- A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service ☆188 · Updated last year
- Save an RSS or ATOM feed to a SQLite database ☆57 · Updated last month
- Create a SQLite database containing data from your Pocket account ☆107 · Updated 2 years ago
- A Python client for the People Data Labs API ☆36 · Updated this week
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆191 · Updated 3 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆142 · Updated last month
- A modern Python library for writing maintainable web scrapers. ☆248 · Updated 2 weeks ago
- Common interface for data container classes ☆68 · Updated last month
- Python functions for applied use of schema.org ☆36 · Updated 4 years ago
- Reading legal authority for the last time ☆41 · Updated 9 months ago