public-law / open-gov-crawlers
Parse government documents into well-formed JSON
☆74 Updated 3 months ago
Alternatives and similar repositories for open-gov-crawlers
Users who are interested in open-gov-crawlers are comparing it to the libraries listed below.
- A database of court reporters, tests and other experiments ☆117 Updated last week
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆62 Updated last week
- A helper library full of URL-related heuristics. ☆73 Updated last month
- Reading legal authority for the last time ☆40 Updated 8 months ago
- Scrape various open data directories to create an index of what's available out there ☆37 Updated 9 months ago
- Web scraping Page Objects core library ☆102 Updated 3 weeks ago
- Datasette plugin to create interactive dashboards ☆157 Updated this week
- Now included in rigour ☆153 Updated 2 months ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆111 Updated last week
- A maximum-strength name parser for record linkage. ☆39 Updated 2 months ago
- Add website scraping abilities to Datasette ☆65 Updated 2 years ago
- Scrape HN to track links from specific domains ☆67 Updated this week
- Software stack with latest Scrapy and updated deps ☆65 Updated 3 months ago
- Scrapy rotation proxy package with advanced functions ☆95 Updated 3 years ago
- A modern Python library for writing maintainable web scrapers. ☆247 Updated 2 weeks ago
- Create a SQLite database containing data from your Pocket account ☆107 Updated 2 years ago
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated last year
- Save an RSS or ATOM feed to a SQLite database ☆57 Updated 3 weeks ago
- Scrapy middleware which allows to crawl only new content ☆79 Updated 3 years ago
- Spider templates for automatic crawlers. ☆32 Updated last month
- A collection of regular expressions for matching citations to state, federal, and even international law ☆40 Updated 4 years ago
- An open-source archive that gathers, saves, shares and analyzes news homepages ☆147 Updated 2 weeks ago
- Extract text from HTML ☆134 Updated 5 years ago
- Command-line tool for fetching JSON from paginated APIs ☆68 Updated last year
- Utilize your personal data like Google! ☆160 Updated 2 years ago
- Page Object pattern for Scrapy ☆123 Updated last month
- A financial disclosure data extraction tool. ☆18 Updated 2 years ago
- World legal info: scraped, organized, and permissively licensed under Creative Commons. ☆20 Updated 6 months ago
- Save data from Google Takeout to a SQLite database ☆115 Updated 2 years ago
- Find rss, atom, xml, and rdf feeds on webpages ☆30 Updated 2 weeks ago