alephdata/memorious

alephdata / memorious

Lightweight web scraping toolkit for documents and structured data.

☆315

Alternatives and similar repositories for memorious

Users who are interested in memorious are comparing it to the libraries listed below.

alephdata / followthemoney
Data model and processing tools for investigative entity data
☆278 · Feb 28, 2026 · Updated 4 months ago
occrp-attic / datacommons
A fleet of Memorious scrapers for crawling various open data sources
☆15 · Sep 24, 2020 · Updated 5 years ago
alephdata / aleph
Search and browse documents and data; find the people and companies you look for.
☆2,398 · Feb 20, 2026 · Updated 5 months ago
alephdata / datadesktop
DEPRECATED. Desktop graph visualization application
☆51 · Sep 30, 2022 · Updated 3 years ago
opensanctions / opensanctions
An open database of international sanctions data, persons of interest and politically exposed persons
☆772 · Updated this week
alephdata / alephclient
API client for Aleph, supports bulk entity and document upload.
☆30 · Mar 5, 2026 · Updated 4 months ago
dataresearchcenter / investigraph
ETL pipeline, graphical explorer and general toolbox for investigations with follow the money data
☆28 · Jul 15, 2025 · Updated last year
alephdata / ingest-file
Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data.
☆67 · Dec 19, 2025 · Updated 7 months ago
datamade / dossier
Machine assisted dossiers
☆19 · Oct 12, 2017 · Updated 8 years ago
opensanctions / fingerprints
Now included in rigour
☆150 · Nov 24, 2025 · Updated 7 months ago
influencemapping / oligrapher2
A re-useable, stand-alone version of LittleSis network storytelling tool
☆12 · Jan 30, 2016 · Updated 10 years ago
opensanctions / qarin
How can we improve name matching in screening tools?
☆17 · Aug 13, 2025 · Updated 11 months ago
opensanctions / offshore-graph
Loading OpenSanctions into Neo4J and Linkurious
☆31 · Dec 17, 2024 · Updated last year
alephdata / synonames
Trying to generate name synonyms from wikidata
☆35 · Jun 28, 2020 · Updated 6 years ago
opensanctions / nomenklatura
Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources
☆257 · Updated this week
sunlightlabs / photosynthesis
Official repo documenting the closure of Sunlight Labs
☆11 · Sep 28, 2016 · Updated 9 years ago
opensanctions / storyweb
Extract networks of entities from journalistic reporting
☆49 · Jul 17, 2023 · Updated 3 years ago
alephdata / react-ftm
React UI component library for aleph/followthemoney
☆12 · Nov 22, 2022 · Updated 3 years ago
pudo / prefixdate
Provide partial dates and retain the date precision through processing
☆14 · Aug 4, 2025 · Updated 11 months ago
pudo-attic / addressformatting
International Address formatter which considers the standard formatting rules of the country
☆14 · Nov 21, 2024 · Updated last year
alephdata / pdflib
Binary Python bindings for poppler utils for content extraction
☆42 · May 12, 2021 · Updated 5 years ago
pudo / normality
A tiny library for Python text normalisation. Useful for ad-hoc text processing.
☆158 · Mar 8, 2026 · Updated 4 months ago
openspending / spendb
Next-gen web application for public finance data warehouses, formerly OpenSpending
☆57 · Jul 6, 2022 · Updated 4 years ago
opensanctions / countrynames
Utility library to turn country names into ISO two-letter codes
☆72 · Updated this week
demodiff / berlin
Assemblies in Berlin: preserving historical data.
☆17 · Updated this week
guardian / giant
Platform for journalists to search, analyse, categorise and share unstructured data
☆59 · Updated this week
pudo / banal
Commons of stupid, simple Python micro functions. Pull requests very welcome.
☆21 · Jun 20, 2026 · Updated last month
pudo-attic / jsonmapping
Transform flat data structures into nested object graphs matching JSON schema definitions.
☆28 · Aug 9, 2016 · Updated 9 years ago
occrp / COVID-19-spending-2020
OCCRP and media partners collected data on COVID-19 related spending from across Europe from February to October 2020
☆14 · Nov 26, 2020 · Updated 5 years ago
bomquote / transistor
Transistor, a Python web scraping framework for intelligent use cases.
☆211 · Jan 29, 2026 · Updated 5 months ago
opensanctions / rigour
Data cleaning and validation functions for names, languages, identifiers, etc.
☆64 · Updated this week
allisson / python-preparer
Simple way to build a new dict based on fields declaration
☆15 · May 7, 2019 · Updated 7 years ago
ICIJ / extract
A cross-platform command line tool for parallelised content extraction and analysis.
☆256 · Updated this week
theSoenke / news-crawler
Crawler that collects and extracts content of daily published news articles
☆14 · Feb 18, 2023 · Updated 3 years ago
opensanctions / poliloom
Help build the world's largest open database of politicians.
☆18 · Jun 8, 2026 · Updated last month
opentrials / api
The OpenTrials API service + database schema definition.
☆12 · Nov 18, 2018 · Updated 7 years ago
openownership / bodsdata
Data analysis tools to help analysts, journalists and anyone wanting to examine and dive into beneficial ownership data published in line…
☆16 · Sep 5, 2025 · Updated 10 months ago
crowdata / crowdata
Easily crowdsource the analysis of your documents
☆102 · Nov 7, 2017 · Updated 8 years ago
pudo / typecast
Simple type converters: make ints, floats, bools and dates from your strings!
☆11 · Jul 23, 2016 · Updated 9 years ago