solvenium / names-dataset
A dataset of multinational first names and last names
☆26 Updated 2 years ago
Alternatives and similar repositories for names-dataset
Users who are interested in names-dataset are comparing it to the libraries listed below.
- Analyze and extract Wikipedia article text and attributes and store them into an ElasticSearch index or to json files (multilingual suppo… ☆47 Updated 2 years ago
- Matrix-based News Aggregation to Explore Media Bias ☆20 Updated 7 years ago
- Boolean text search in Python ☆46 Updated 2 months ago
- A fuzzy matching & clustering library for python. ☆26 Updated last month
- Fast and robust date extraction from web pages, with Python or on the command-line ☆138 Updated last month
- Record Linkage ToolKit (Find and link entities) ☆110 Updated 2 years ago
- This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around t… ☆33 Updated 2 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆96 Updated this week
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. ☆71 Updated 2 years ago
- Trying to generate name synonyms from wikidata ☆32 Updated 5 years ago
- an experimental implementation of Burrow's delta in Python 3 ☆21 Updated 3 years ago
- ☆81 Updated 6 years ago
- A helper library full of URL-related heuristics. ☆70 Updated this week
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated last year
- Unreliable News Index (for Columbia Journalism Review) ☆56 Updated 3 years ago
- Now included in rigour ☆151 Updated 3 weeks ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆46 Updated 7 years ago
- A toolkit for CDX indices such as Common Crawl and the Internet Archive's Wayback Machine ☆183 Updated 8 months ago
- 🚀GUI for training spaCy models ☆55 Updated 4 years ago
- Extract dates from text ☆65 Updated 4 years ago
- Index Common Crawl archives in tabular format ☆122 Updated last month
- Use ML-Annotate to label data for machine learning purposes ☆111 Updated 5 years ago
- This repository contains an implementation of a US address parser built using spaCy NLP library. ☆38 Updated 2 years ago
- Meta-repository for the open-source version of the SUMMA Platform ☆16 Updated last year
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆61 Updated this week
- GraphiPy: Universal Social Data Extractor ☆83 Updated 2 years ago
- Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and processing large text collections with ML and for ML ☆63 Updated 7 months ago
- Extracting addresses from text ☆42 Updated 7 years ago
- Interpretable feature construction from taxonomies for text classification ☆18 Updated 3 years ago
- An email segmentation system (reference implementation of ECIR 2018 paper) ☆10 Updated 5 years ago