solvenium / names-dataset
A dataset of multinational first names and last names
☆27 Updated 2 years ago
Alternatives and similar repositories for names-dataset
Users who are interested in names-dataset compare it to the libraries listed below.
- A helper library full of URL-related heuristics. ☆73 Updated 2 months ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated 2 years ago
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated last year
- Now included in rigour ☆152 Updated 2 weeks ago
- A curated list of promising Web Data Extractors resources ☆29 Updated 5 years ago
- This repository contains an implementation of a US address parser built using spaCy NLP library. ☆38 Updated 2 years ago
- Extract dates from text ☆66 Updated 4 years ago
- NameKrea is an AI Domain Name Generator which uses GPT-2 ☆49 Updated 3 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆142 Updated last month
- Trying to generate name synonyms from wikidata ☆34 Updated 5 years ago
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆189 Updated 3 weeks ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 Updated 7 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆116 Updated this week
- ☆80 Updated 7 years ago
- This repository provides various Python methods for finding and aggregating synonyms for an individual word or a list of words. ☆35 Updated 2 years ago
- Index Common Crawl archives in tabular format ☆124 Updated last week
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆63 Updated this week
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated 2 years ago
- Analyze and extract Wikipedia article text and attributes and store them into an ElasticSearch index or to json files (multilingual suppo… ☆47 Updated 2 years ago
- Tools to construct and process Common Crawl webgraphs ☆102 Updated this week
- Download subreddit comments ☆96 Updated 3 years ago
- Advanced news feeds extractor and finder library. Helps to automatically extract news from websites without RSS/ATOM feeds ☆81 Updated 2 weeks ago
- An email segmentation system (reference implementation of ECIR 2018 paper) ☆10 Updated 6 years ago
- A comprehensive database of name variants ☆47 Updated 3 years ago
- Record Linkage ToolKit (Find and link entities) ☆111 Updated 2 years ago
- A fast python implementation of the SimHash algorithm. ☆27 Updated 4 years ago
- Boolean text search in Python ☆46 Updated 5 months ago
- An automated, programming-free web scraper for interactive sites ☆111 Updated 2 years ago
- Extract networks of entities from journalistic reporting ☆49 Updated 2 years ago
- Resolve the `location` string in Twitter users' profiles to US states (and cities) ☆19 Updated 9 years ago