alephdata / memorious
Lightweight web scraping toolkit for documents and structured data.
☆311 · Updated last year
Alternatives and similar repositories for memorious:
Users who are interested in memorious are comparing it to the libraries listed below.
- Data model and processing tools for investigative entity data ☆228 · Updated this week
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation. ☆151 · Updated 3 months ago
- The data journalism platform with built-in training ☆305 · Updated 5 months ago
- Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources ☆210 · Updated last week
- A cross-platform command line tool for parallelised content extraction and analysis. ☆245 · Updated last week
- Extract networks of entities from journalistic reporting ☆48 · Updated last year
- Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data. ☆62 · Updated this week
- A toolkit for mapping networks of political and economic influence through diverse types of entities and their relations. Accessible at h… ☆188 · Updated 4 years ago
- An open database of international sanctions data, persons of interest and politically exposed persons ☆554 · Updated this week
- Website crawler with built-in exploration and control web interface ☆350 · Updated this week
- CoCrawler is a versatile web crawler built using modern tools and concurrency. ☆190 · Updated 3 years ago
- Utility library to turn country names into ISO two-letter codes ☆66 · Updated 2 months ago
- A tiny library for Python text normalisation. Useful for ad-hoc text processing. ☆151 · Updated 3 months ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆81 · Updated this week
- Twitter stream + search API grabber ☆104 · Updated last year
- Information extraction and interactive visualization of textual datasets for investigative data-driven journalism and eDiscovery ☆57 · Updated 9 months ago
- Social Feed Manager user interface application. ☆155 · Updated 10 months ago
- API client for Aleph, supports bulk entity and document upload. ☆28 · Updated 6 months ago
- Python library for reading and writing tabular data via streams. ☆237 · Updated 3 years ago
- Platform for journalists to search, analyse, categorise and share unstructured data ☆55 · Updated 2 weeks ago
- Backend for the search engine service in Liquid Investigations. ☆20 · Updated 7 months ago
- Data validation as a service. Project retired; go to the current one at frictionsless/repository ☆69 · Updated 2 years ago
- A helper library full of URL-related heuristics. ☆69 · Updated last month
- Easily crowdsource the analysis of your documents ☆102 · Updated 7 years ago
- JavaScript app for displaying annotated network graphs based on data from LittleSis ☆102 · Updated 2 months ago
- Searching large heterogeneous data dumps with Universal Sentence Encoder ☆62 · Updated 3 years ago
- A modern Python library for writing maintainable web scrapers. ☆250 · Updated 9 months ago
- A Python library for parsing unstructured Western names into name components. ☆606 · Updated 6 months ago
- Collaborative data collection tool developed by the Associated Press ☆109 · Updated 2 years ago
- Tools for generating CSV and other flat versions of the structured data ☆107 · Updated this week