sigpwned / popular-names-by-country-dataset
A dataset of popular forenames and surnames by country
☆47 Updated 2 years ago
Alternatives and similar repositories for popular-names-by-country-dataset
Users who are interested in popular-names-by-country-dataset are comparing it to the repositories listed below.
- Estimating the age of web resources ☆96 Updated 4 months ago
- Public API client for GETTR, a "non-bias [sic] social network," designed for data archival and analysis. ☆94 Updated 4 months ago
- Now included in rigour ☆152 Updated last month
- 🤬 Map of profane words to a rating of sureness ☆256 Updated 2 years ago
- A helper library full of URL-related heuristics. ☆73 Updated 3 weeks ago
- Example scripts for the pushshift dump files ☆407 Updated 2 months ago
- Websites crawler with built-in exploration and control web interface ☆360 Updated last month
- Index Common Crawl archives in tabular format ☆122 Updated 2 months ago
- A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches. ☆221 Updated 2 years ago
- ChatGPT with access to the internet ☆26 Updated 2 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆141 Updated 2 months ago
- Extract networks of entities from journalistic reporting ☆48 Updated 2 years ago
- A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine. ☆118 Updated last year
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆41 Updated 6 years ago
- A database of court reporters, tests and other experiments ☆116 Updated last week
- Ultimate Website Sitemap Parser ☆225 Updated last month
- Making Reddit data accessible to researchers, moderators and everyone else. Interact with the data through large dumps, an API or web in… ☆597 Updated last week
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆185 Updated this week
- Offline database of synonyms/thesaurus ☆203 Updated last year
- Tracking the far right on Twitter ☆63 Updated 2 years ago
- etl pipeline, graphical explorer and general toolbox for investigations with follow the money data ☆23 Updated 3 months ago
- Download subreddit comments ☆97 Updated 3 years ago
- The Java Graphical Authorship Attribution Program ☆277 Updated last year
- A webmining CLI tool & library for python. ☆339 Updated last week
- A narrative-focused agent-based settlement simulation framework. ☆68 Updated 3 months ago
- A collection of tools for archiving and analysing the internet. ☆78 Updated 3 years ago
- Scrapers for U.S. county court sites. ☆73 Updated 2 years ago
- Tool and library for handling Web ARChive (WARC) files. ☆164 Updated last year
- Find legal citations in any block of text ☆176 Updated 2 weeks ago
- The Edinburgh Associative Thesaurus (EAT) is a set of word association norms showing the counts of word association as collected from sub… ☆46 Updated 3 years ago