sigpwned / popular-names-by-country-dataset
A dataset of popular forenames and surnames by country
☆40 · Updated 2 years ago
Alternatives and similar repositories for popular-names-by-country-dataset
Users who are interested in popular-names-by-country-dataset are comparing it to the repositories listed below.
- 📦 A list, huge one (~200K) of human male/female first/last names. ☆54 · Updated last year
- Example scripts for the pushshift dump files ☆386 · Updated 2 weeks ago
- Estimating the age of web resources ☆96 · Updated 2 months ago
- Tracking the far right on Twitter ☆62 · Updated last year
- Now included in rigour ☆151 · Updated last week
- Download subreddit comments ☆96 · Updated 3 years ago
- A webmining CLI tool & library for Python. ☆333 · Updated 2 months ago
- A dataset of multinational first names and last names ☆26 · Updated 2 years ago
- Text databases of last names from various countries ☆280 · Updated 2 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆136 · Updated last week
- These tweets display several bad actors' most divisive uses of the Twitter platform. ☆49 · Updated 2 years ago
- Index Common Crawl archives in tabular format ☆123 · Updated last week
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) ☆164 · Updated last month
- A Python package for matching words that use different characters but appear to be similar or identical. ☆62 · Updated last year
- A simple Python wrapper and command-line interface for archive.org's "Save Page Now" capturing service ☆180 · Updated 9 months ago
- A Python API to the Internet Archive Wayback Machine ☆76 · Updated last year
- A helper library full of URL-related heuristics. ☆70 · Updated 2 months ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆53 · Updated 4 years ago
- JSON file of all games available on Steam with prices and additional data from Steam Spy, GameFAQs, Metacritic, IGDB and HLTB. ☆92 · Updated 2 years ago
- An analysis of YouTube's political influence through recommendations. ☆155 · Updated 2 years ago
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆178 · Updated 7 months ago
- Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city. ☆131 · Updated 2 weeks ago
- ☆15 · Updated last year
- Platform for journalists to search, analyse, categorise and share unstructured data ☆55 · Updated this week
- The graphics portion of Dwarf Fortress. ☆50 · Updated this week
- Public API client for GETTR, a "non-bias [sic] social network," designed for data archival and analysis. ☆93 · Updated last month
- A framework for quick web archiving; canonical repository: https://gitea.arpa.li/JustAnotherArchivist/qwarc ☆29 · Updated 4 years ago
- Twitter Blue data ☆123 · Updated 2 years ago
- Scrape Twitter API without authentication using Nitter. ☆64 · Updated 2 years ago
- I wanted all of plaintext Project Gutenberg in an easy-to-use format, so I made this ☆222 · Updated 2 years ago