YannBrrd / elasticsearch-entity-resolution
Elasticsearch entity resolution plugin based on Duke
☆210 Updated 4 years ago
Alternatives and similar repositories for elasticsearch-entity-resolution:
Users who are interested in elasticsearch-entity-resolution compare it to the libraries listed below.
- Additional OpenNLP mapping type for Elasticsearch in order to perform named entity recognition ☆136 Updated 8 years ago
- Mazerunner extends a Neo4j graph database to run scheduled big data graph compute algorithms at scale with HDFS and Apache Spark. ☆128 Updated 9 years ago
- ☆111 Updated 7 years ago
- Scalable query engine for web scraping/data mashup/acceptance QA, powered by Apache Spark ☆142 Updated this week
- Duke is a fast and flexible deduplication engine written in Java ☆618 Updated last year
- ☆92 Updated 9 years ago
- A platform for real-time streaming search ☆103 Updated 9 years ago
- Spark RDD with Lucene's query and entity linkage capabilities ☆125 Updated last month
- A Java library for stored queries ☆375 Updated 2 years ago
- Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets ☆92 Updated 9 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 Updated 5 years ago
- Mazerunner extends a Neo4j graph database to run scheduled big data graph compute algorithms at scale with HDFS and Apache Spark. ☆382 Updated 2 years ago
- Graphify is a Neo4j unmanaged extension used for document and text classification using graph-based hierarchical pattern recognition. ☆380 Updated 4 years ago
- Carrot2 plugin for Elasticsearch ☆292 Updated 2 years ago
- This project combines Apache Spark and Elasticsearch to enable mining & prediction for Elasticsearch. ☆210 Updated 10 years ago
- A toolkit that wraps various natural language processing implementations behind a common interface. ☆101 Updated 7 years ago
- GraphAware Timer-Driven Runtime Module that executes a PageRank-like algorithm on the graph ☆26 Updated 7 years ago
- MLeap allows for easily putting Spark ML pipelines into production ☆78 Updated 8 years ago
- Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark ☆147 Updated 9 years ago
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop. ☆281 Updated 6 years ago
- Beyond Piwik Analytics with Scala and Apache Spark ☆46 Updated 10 years ago
- Elasticsearch Index Termlist ☆117 Updated 5 years ago
- A text tagger based on Lucene / Solr, using FST technology ☆176 Updated last year
- Chalk is a natural language processing library. ☆259 Updated 8 years ago
- Text classification using Naive Bayes and Elasticsearch ☆154 Updated 8 years ago
- Elasticsearch Latent Semantic Indexing experimentation ☆33 Updated 5 years ago
- Functional, Typesafe, Declarative Data Pipelines ☆139 Updated 7 years ago
- Juicer is a web API for extracting text, metadata and named entities from HTML "article" type pages. ☆60 Updated 9 years ago
- Locality Sensitive Hashing for Apache Spark ☆195 Updated 8 years ago
- Custom graph algorithms for Neo4j with their own Java and REST APIs ☆34 Updated 8 years ago