AlonEirew / wikipedia-to-elastic
Analyze and extract Wikipedia article text and attributes and store them into an ElasticSearch index or to json files (multilingual support)
☆47 Updated last year
Alternatives and similar repositories for wikipedia-to-elastic:
Users who are interested in wikipedia-to-elastic are comparing it to the libraries listed below.
- Automatically exported from code.google.com/p/wiki-links ☆42 Updated 9 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 Updated 4 years ago
- Wikidata embedding ☆50 Updated 6 months ago
- A Named-Entity Recogniser based on Grobid. ☆52 Updated 7 months ago
- Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Sear… ☆85 Updated 3 years ago
- Code accompanying our paper "One Knowledge Graph to Rule them All? Analyzing the Differences between DBpedia, YAGO, Wikidata & co." ☆26 Updated 7 years ago
- A thin wrapper around the DBpedia Spotlight HTTP API ☆25 Updated 7 years ago
- This repository includes all the code and data for the paper ELiDi (End2end Entity Linking and Disambiguation) ☆14 Updated 3 years ago
- Simple Wikipedia plain text extractor with article link annotations and Hadoop support. ☆103 Updated 14 years ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆112 Updated 3 months ago
- Extracting useful metadata from Wikipedia dumps in any language. ☆26 Updated 5 years ago
- Linking Entities in CommonCrawl Dataset onto Wikipedia Concepts ☆59 Updated 12 years ago
- Meta-repository for the open-source version of the SUMMA Platform ☆15 Updated last year
- A temporal ordering system for events and time expressions in written text. ☆43 Updated 3 years ago
- Filter and format a newline-delimited JSON stream of Wikibase entities ☆97 Updated 6 months ago
- An open information extraction system that provides compact extractions ☆91 Updated 3 years ago
- A web application tagging and retrieval of arguments in text ☆28 Updated 2 years ago
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities ☆115 Updated 2 years ago
- CrowdTruth framework for crowdsourcing ground truth for training & evaluation of AI systems ☆60 Updated last year
- TeXoo – A Zoo of Text Extractors ☆18 Updated 4 years ago
- spaCy pipeline component for generating spaCy KnowledgeBase Alias Candidates for Entity Linking ☆85 Updated 2 years ago
- This is an implementation of Hearst patterns, for finding hyponyms, written in Python. ☆87 Updated 2 years ago
- Record Linkage ToolKit (Find and link entities) ☆110 Updated last year
- Keras implementation of ontology aware token embeddings ☆48 Updated 6 years ago
- D3 and Play based visualization for entity-relation graphs, especially for NLP and information extraction ☆30 Updated 9 years ago
- Entity Linking for the masses ☆56 Updated 9 years ago
- The SMAPH system for query entity linking. ☆20 Updated 6 years ago
- Extracting narrative timelines (i.e. order and timing of events) from text ☆20 Updated 6 years ago
- Provides web credibility models (Likert scale) to assign a trustworthiness score to a given website. ☆11 Updated 5 years ago
- Python toolkit for ranking experiments on sentence/summary data ☆24 Updated 2 years ago