sigpwned / popular-names-by-country-dataset
A dataset of popular forenames and surnames by country
☆48Updated 2 years ago
Alternatives and similar repositories for popular-names-by-country-dataset
Users who are interested in popular-names-by-country-dataset are comparing it to the libraries listed below
- Now included in rigour☆152Updated 2 months ago
- JSON file of all games available on Steam with prices and additional data from Steam Spy, GameFAQs, Metacritic, IGDB and HLTB.☆92Updated 2 years ago
- Offline database of synonyms/thesaurus☆204Updated last year
- The largest English-language thesaurus☆307Updated last month
- A simple script for using Google's Vision API that will possibly develop into an actual tool.☆13Updated 7 years ago
- track changes to the news, where news is anything with an RSS feed☆179Updated 5 years ago
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)☆167Updated 2 months ago
- Estimating the age of web resources☆96Updated 5 months ago
- ☆110Updated last month
- ETL pipeline, graphical explorer and general toolbox for investigations with follow the money data☆23Updated 3 months ago
- Experimental proxy and wrapper for safely embedding Web Archives (warc, warc.gz, wacz) into web pages.☆38Updated 6 months ago
- A helper library full of URL-related heuristics.☆73Updated last month
- JavaScript module and CLI tool for working with web archive data using the WACZ format specification.☆16Updated 7 months ago
- A webmining CLI tool & library for python.☆340Updated last week
- Word lists from the web.☆92Updated 9 years ago
- A polite and user-friendly downloader for Common Crawl data☆57Updated 2 months ago
- Automated behaviors that run in browser to interact with complex sites automatically. Used by ArchiveWeb.page and Browsertrix Crawler.☆51Updated last week
- Documentation and project-wide issues for the Website Monitoring project (a.k.a. "Scanner")☆110Updated 3 weeks ago
- Ethical, legal, and effortless extraction of Reddit data in your database☆80Updated last week
- Diary for qualitative analysis☆28Updated 3 months ago
- Loghi is a comprehensive toolkit designed for Handwritten Text Recognition (HTR) and Optical Character Recognition (OCR), offering an acc…☆131Updated last month
- Generate search links for many genealogy websites.☆23Updated 3 years ago
- ☆111Updated last year
- Public API client for GETTR, a "non-bias [sic] social network," designed for data archival and analysis.☆95Updated 4 months ago
- A collection of tools for archiving and analysing the internet.☆78Updated 3 years ago
- Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources☆224Updated last week
- The Misinformation Game is a social-media simulator built to study how people interact with information on social-media.☆31Updated this week
- The Python library for names.☆956Updated 7 months ago
- A Python module to manipulate data on a Wikibase instance (like Wikidata) through the MediaWiki Wikibase API and the Wikibase SPARQL endp…☆80Updated 2 weeks ago
- Extract networks of entities from journalistic reporting☆48Updated 2 years ago