tfmorris / Names
A comprehensive database of name variants
☆46 · Updated 2 years ago
Alternatives and similar repositories for Names:
Users who are interested in Names are comparing it to the repositories listed below.
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆48 · Updated 2 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆44 · Updated 7 years ago
- Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation. ☆148 · Updated 3 weeks ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆51 · Updated 4 years ago
- Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date ☆41 · Updated 4 years ago
- A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agen… ☆48 · Updated 7 years ago
- Google Refine extension for adding columns (extending data) from DBpedia ☆39 · Updated 11 years ago
- Events and Situations Ontology ☆13 · Updated 6 years ago
- Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources ☆204 · Updated this week
- Named-Entity Recognition extension for Google Refine / OpenRefine ☆72 · Updated 7 years ago
- KnowledgeStore ☆20 · Updated 7 years ago
- Pikes is a Knowledge Extraction Suite ☆23 · Updated last year
- Advanced desktop search/corpus exploration prototype ☆21 · Updated 3 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆46 · Updated 3 years ago
- Parser and standardizer for politician, individual and organization names. ☆129 · Updated 7 years ago
- ☆21 · Updated 6 years ago
- An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX) ☆25 · Updated 7 years ago
- The linked open dataset described at http://datahub.io/dataset/vu-wordnet, and the tools used to create it ☆25 · Updated 4 years ago
- Loading OpenSanctions into Neo4J and Linkurious ☆28 · Updated 2 months ago
- 🚀 GUI for training spaCy models ☆54 · Updated 3 years ago
- A CSV file with US given names (first name) and their associated nicknames or diminutive names. ☆294 · Updated 3 months ago
- Record Linkage ToolKit (Find and link entities) ☆108 · Updated last year
- Wikidata authority file mapping tool ☆11 · Updated 6 years ago
- Named Entities Recognition Annotator Tool for Europeana Newspapers ☆60 · Updated 7 years ago
- A tool for calculating semantic similarity between words from a text corpus based on lexico-syntactic patterns. ☆28 · Updated 9 years ago
- 🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec ☆60 · Updated 3 years ago
- Homebase of the IPTC EXTRA project about rule-based text categorization ☆13 · Updated 7 years ago
- Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends ☆55 · Updated last year
- Specification of NAF, the NLP annotation format ☆21 · Updated 4 years ago
- API implementation, User Interface, and more modules of the IPTC EXTRA project ☆12 · Updated 3 years ago