tfmorris / Names
A comprehensive database of name variants
☆44 Updated 2 years ago
Alternatives and similar repositories for Names:
Users who are interested in Names are comparing it to the libraries listed below.
- Homebase of the IPTC EXTRA project about rule-based text categorization ☆13 Updated 7 years ago
- A CSV file with US given names (first name) and their associated nicknames or diminutive names. ☆294 Updated 2 months ago
- Google Refine extension for adding columns (extending data) from DBpedia ☆39 Updated 11 years ago
- Named-Entity Recognition extension for Google Refine / OpenRefine ☆72 Updated 7 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- A platform for collecting, analyzing, and visualizing social media data. ☆12 Updated 4 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆50 Updated 4 years ago
- 🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec ☆60 Updated 3 years ago
- Web-based synthesis of nifty NLP and entity extraction services ☆13 Updated 5 years ago
- ☆21 Updated 9 months ago
- Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources ☆202 Updated this week
- Specification of NAF, the NLP annotation format ☆21 Updated 4 years ago
- Semanticizest: dump parser and client ☆20 Updated 8 years ago
- ☆21 Updated 6 years ago
- ☆13 Updated 7 years ago
- Home of the IPTC NewsML-G2 standard for the exchange of news and news-related information ☆15 Updated 2 months ago
- Source real estate prices from the Common Crawl. ☆27 Updated 6 years ago
- Advanced desktop search/corpus exploration prototype ☆21 Updated 3 years ago
- Trying to generate name synonyms from wikidata ☆33 Updated 4 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆43 Updated 7 years ago
- All that entity matching, resolution, normalization, enhancement and reconciliation madness, but with a focus on data, not platforms. ☆24 Updated 2 years ago
- A simple OpenRefine reconciliation service that runs on top of a CSV file ☆119 Updated 9 years ago
- Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date ☆41 Updated 4 years ago
- Wikidata authority file mapping tool ☆11 Updated 6 years ago
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆48 Updated 2 years ago
- Wayward is a Python package that helps to identify characteristic terms from single documents or groups of documents. It can be used for … ☆9 Updated 5 years ago
- ☆24 Updated 9 years ago
- An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX) ☆25 Updated 7 years ago
- extensible Web Retrieval Toolkit ☆17 Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆36 Updated 5 months ago