alephdata / memorious
Lightweight web scraping toolkit for documents and structured data.
☆309 Updated 10 months ago
Related projects
Alternatives and complementary repositories for memorious
- Data model and processing tools for investigative entity data ☆218 Updated last week
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation. ☆145 Updated 9 months ago
- Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources ☆200 Updated this week
- The data journalism platform with built-in training ☆306 Updated last year
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆71 Updated this week
- An open database of international sanctions data, persons of interest and politically exposed persons ☆500 Updated this week
- Extract networks of entities from journalistic reporting ☆47 Updated last year
- Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data. ☆59 Updated last week
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆187 Updated 2 years ago
- A self-hosted search engine for documents. ☆598 Updated this week
- Website crawler with built-in exploration and control web interface ☆328 Updated 2 months ago
- An automated, programming-free web scraper for interactive sites ☆107 Updated last year
- A toolkit for mapping networks of political and economic influence through diverse types of entities and their relations. Accessible at h… ☆187 Updated 3 years ago
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated last month
- A modern Python library for writing maintainable web scrapers. ☆244 Updated 4 months ago
- Social Feed Manager user interface application. ☆153 Updated 4 months ago
- Framework for scraping legislative/government data ☆85 Updated 2 months ago
- Search and browse documents and data; find the people and companies you look for. ☆2,036 Updated this week
- Monitor stories from news outlets for words or phrases that matter to you ☆138 Updated 2 months ago
- Searching large heterogeneous data dumps with Universal Sentence Encoder ☆62 Updated 3 years ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆144 Updated 10 months ago
- 🔎 Finds fuzzy matches between CSV files ☆184 Updated 7 months ago
- Platform for journalists to search, analyse, categorise and share unstructured data ☆53 Updated last week
- Collaborative data collection tool developed by the Associated Press ☆107 Updated last year
- Python library for automating the administration of Google Alerts. ☆98 Updated last year
- A cross-platform command line tool for parallelised content extraction and analysis. ☆241 Updated 2 months ago
- A helper library full of URL-related heuristics. ☆64 Updated last month
- JavaScript app for displaying annotated network graphs from the LittleSis API and other data sources ☆39 Updated 6 years ago
- Computer-Assisted Reporting and Data Journalism Syllabuses, compiled by Dan Nguyen ☆177 Updated 3 years ago
- Media Cloud is an open source, open data platform that allows researchers to answer quantitative questions about the content of online me… ☆281 Updated last year