sigpwned / popular-names-by-country-dataset
A dataset of popular forenames and surnames by country
★42 · Updated 2 years ago
Alternatives and similar repositories for popular-names-by-country-dataset
Users who are interested in popular-names-by-country-dataset are comparing it to the repositories listed below.
- A huge list (~200K) of human male/female first/last names. ★54 · Updated last year
- JSON file of all games available on Steam with prices and additional data from Steam Spy, GameFAQs, Metacritic, IGDB and HLTB. ★92 · Updated 2 years ago
- Python wrapper for the MediaWiki API to access and parse data from Wikipedia ★42 · Updated 3 weeks ago
- Command line tools and libraries for handling and manipulating WARC files (and HTTP contents) ★164 · Updated last month
- Map of profane words to a rating of sureness ★256 · Updated 2 years ago
- Now included in rigour ★151 · Updated 2 weeks ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ★53 · Updated 4 years ago
- track changes to the news, where news is anything with an RSS feed ★179 · Updated 5 years ago
- Dataset: BuzzFeed News "Trending" Strip, 2018–2023 ★19 · Updated 2 years ago
- A Python API to the Internet Archive Wayback Machine ★78 · Updated last week
- Corpus linguistics has never been so easy... ★24 · Updated 2 months ago
- The Python library for names. ★945 · Updated 5 months ago
- Tool and library for handling Web ARChive (WARC) files. ★164 · Updated 11 months ago
- API client for Aleph, supports bulk entity and document upload. ★28 · Updated 11 months ago
- Please note that the warc-indexer tool & code is now supported by NetArchiveSuite. The 'warc-indexer' directory and code that exists in t… ★128 · Updated 2 months ago
- Centralised repository for WARC usage specifications. ★117 · Updated 10 months ago
- The scraper/parser that produces data for TheyWorkForYou, PublicWhip, etc ★66 · Updated this week
- A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine. ★458 · Updated last year
- Curated lists of credible and non-credible online sources, available for public use ★91 · Updated 7 years ago
- ★107 · Updated last month
- Command-line tool and Rust library for handling Web ARChive (WARC) files ★25 · Updated 3 months ago
- Repository for developing collaboratively the video game ontology ★16 · Updated 9 years ago
- A structured dataset of emails sent at Atari from 1983 to 1992. ★17 · Updated 3 years ago
- A simple Python wrapper and command-line interface for archive.org's "Save Page Now" capturing service ★184 · Updated 11 months ago
- A collection of computer tools for aiding the text critical workflow from transcription to collation to analysis. ★23 · Updated 5 months ago
- A webmining CLI tool & library for python. ★336 · Updated this week
- A helper library full of URL-related heuristics. ★70 · Updated this week
- Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources ★221 · Updated this week
- A CSV file with US given names (first name) and their associated nicknames or diminutive names. ★305 · Updated last month
- Estimating the age of web resources ★96 · Updated 3 months ago