solvenium / names-dataset
A dataset of multinational first names and last names
☆25 Updated last year
Related projects
Alternatives and complementary repositories for names-dataset
- Record Linkage ToolKit (Find and link entities) ☆106 Updated last year
- Meta-repository for the open-source version of the SUMMA Platform ☆16 Updated 7 months ago
- TeXoo – A Zoo of Text Extractors ☆18 Updated 4 years ago
- Analyze and extract Wikipedia article text and attributes and store them into an ElasticSearch index or to json files (multilingual suppo… ☆46 Updated last year
- A helper library full of URL-related heuristics. ☆64 Updated last month
- 🚀GUI for training spaCy models ☆53 Updated 3 years ago
- Fast and robust date extraction from web pages, with Python or on the command-line ☆122 Updated last week
- Matrix-based News Aggregation to Explore Media Bias ☆20 Updated 6 years ago
- Topic Detection from English text using BERT + Bi-GRU + CRF ☆14 Updated 4 years ago
- An Email Segmentation System ☆9 Updated 4 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- Aviation grade news article metadata extraction ☆36 Updated last year
- A comprehensive database of name variants ☆44 Updated 2 years ago
- Finds linguistic patterns effortlessly ☆33 Updated last year
- Interpretable feature construction from taxonomies for text classification ☆18 Updated 2 years ago
- In the wild extraction of entities that are found using Flair and displayed using a very elegant front-end. ☆69 Updated last year
- API client for Aleph, supports bulk entity and document upload. ☆28 Updated last month
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆42 Updated 6 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆57 Updated 9 months ago
- 🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec ☆60 Updated 3 years ago
- TAXI: a Taxonomy Induction Method based on Lexico-Syntactic Patterns, Substrings and Focused Crawling ☆29 Updated last year
- Named entity recognition for the legal domain ☆40 Updated 3 years ago
- Abydos NLP/IR library for Python ☆183 Updated 2 years ago
- Tool to generate paraphrases of sentences in many languages. ☆77 Updated 2 years ago
- Use ML-Annotate to label data for machine learning purposes ☆104 Updated 4 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by the Incorporated… ☆25 Updated 2 years ago
- NSS Capstone project to use natural language modeling, classification, and information extraction to get the exact employee count values … ☆15 Updated 6 years ago
- API - extract a list of keywords from a text. ☆18 Updated 7 years ago
- Loading OpenSanctions into Neo4J and Linkurious ☆27 Updated last month
- Integration between Reaction ECommerce and Accelerated Text to provide product descriptions for an e-shop. ☆9 Updated 3 years ago