solvenium / names-dataset
A dataset of multinational first names and last names
☆27 Updated 2 years ago
Alternatives and similar repositories for names-dataset
Users who are interested in names-dataset compare it to the libraries listed below.
- Record Linkage ToolKit (Find and link entities) ☆111 Updated 2 years ago
- A helper library full of URL-related heuristics. ☆73 Updated 4 months ago
- Now included in rigour ☆152 Updated 2 months ago
- This page is a companion for the paper titled Towards Automatic Structuring and Semantic Indexing of Legal Documents ☆29 Updated 2 months ago
- API client for Aleph, supports bulk entity and document upload. ☆29 Updated last year
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆66 Updated 2 weeks ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆144 Updated 2 months ago
- 🚀 GUI for training spaCy models ☆55 Updated 4 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆121 Updated this week
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. ☆71 Updated 3 years ago
- Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of e… ☆196 Updated 3 years ago
- Trying to generate name synonyms from wikidata ☆34 Updated 5 years ago
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated 2 years ago
- A comprehensive database of name variants ☆48 Updated 3 years ago
- Index Common Crawl archives in tabular format ☆124 Updated last month
- This repository provides various Python methods for finding and aggregating synonyms for an individual word or a list of words. ☆36 Updated 2 years ago
- A machine learning tool for fishing entities ☆270 Updated 8 months ago
- Python wrapper library for the Datamuse API ☆82 Updated 2 years ago
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆197 Updated last week
- Tag news stories based on models trained on the NYT corpus. ☆42 Updated 2 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆34 Updated 2 years ago
- Extract dates from text ☆66 Updated 5 years ago
- Command-line tool to extract a ranked list of relevant keywords from a corpus with the option of using either topic modeling or tf-idf sc… ☆41 Updated 8 years ago
- Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data. ☆65 Updated last month
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆58 Updated 2 years ago
- A fuzzy matching & clustering library for python. ☆26 Updated 6 months ago
- ☆79 Updated 7 years ago
- Python/Django based webapps and web user interfaces for search, structure (meta data management like thesaurus, ontologies, annotations a… ☆99 Updated 3 years ago
- Analyze and extract Wikipedia article text and attributes and store them into an ElasticSearch index or to json files (multilingual suppo… ☆48 Updated 2 years ago
- Browser version of Hyphe (WIP) ☆32 Updated 8 months ago