tfmorris / Names
A comprehensive database of name variants
☆47Updated 3 years ago
Alternatives and similar repositories for Names
Users who are interested in Names are comparing it to the libraries listed below.
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.☆46Updated 7 years ago
- Named-Entity Recognition extension for Google Refine / OpenRefine☆72Updated 8 years ago
- Google Refine extension for adding columns (extending data) from DBpedia☆39Updated 11 years ago
- A set of workflows for corpus building through OCR, post-correction and normalisation☆49Updated 2 years ago
- Framework and command-line tools for integrating FollowTheMoney data streams from multiple sources☆212Updated this week
- All that entity matching, resolution, normalization, enhancement and reconciliation madness, but with a focus on data, not platforms.☆24Updated 3 years ago
- Metadata and per-statute PDFs for the U.S. Statutes at Large through volume 64 (1789-1951).☆17Updated 5 years ago
- An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)☆25Updated 7 years ago
- Specification of NAF, the NLP annotation format☆21Updated 4 years ago
- Code to remove "noise" from hOCR output of Tesseract OCR.☆14Updated 8 years ago
- Sort-friendly URI Reordering Transform (SURT) python module☆42Updated 10 months ago
- This repository contains the Domain Discovery Tool (DDT) project. DDT is an interactive system that helps users explore and better unders…☆45Updated 3 years ago
- A powerful, tagset-independent and theory-neutral meta model and API for storing, manipulating, and representing nearly all types of ling…☆15Updated 2 years ago
- Now included in rigour☆151Updated last month
- Common web archive utility code.☆55Updated last month
- Advanced desktop search/corpus exploration prototype☆21Updated 4 years ago
- A tool for calculating semantic similarity between words from a text corpus based on lexico-syntactic patterns.☆27Updated 9 years ago
- Record Linkage ToolKit (Find and link entities)☆110Updated last year
- Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date☆41Updated 4 years ago
- Events and Situations Ontology☆14Updated 7 years ago
- ☆71Updated 5 months ago
- ☆21Updated 7 years ago
- Json Wikipedia, contains code to convert the Wikipedia xml dump into a json dump. Questions? https://gitter.im/idio-opensource/Lobby☆17Updated 3 years ago
- Ontologies of Linguistic Annotation. Machine-readable tagsets and annotation schemata for more than 100 languages.☆20Updated last month
- Topic Modeling Workflow in Python☆16Updated 2 years ago
- Parser and standardizer for politician, individual and organization names.☆129Updated 8 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate…☆52Updated 5 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu…☆64Updated last year
- A framework to allow the matching of string entities using customised sets of transformations and matchers, plus a tool to produce the ne…☆33Updated 8 years ago
- Data Mining Historical Newspaper Metadata (METS/ALTO formats)☆25Updated 2 years ago