larsga / Duke
Duke is a fast and flexible deduplication engine written in Java
☆623 · Updated last year
Alternatives and similar repositories for Duke
Users who are interested in Duke are comparing it to the libraries listed below.
- Elasticsearch entity resolution plugin based on Duke ☆209 · Updated 5 years ago
- A Java library for stored queries ☆376 · Updated 2 years ago
- Data Integration Graph ☆206 · Updated 6 years ago
- Mazerunner extends a Neo4j graph database to run scheduled big data graph compute algorithms at scale with HDFS and Apache Spark. ☆381 · Updated 2 years ago
- Behemoth is an open source platform for large scale document analysis based on Apache Hadoop. ☆282 · Updated 7 years ago
- Banana for Solr - A Port of Kibana ☆671 · Updated 11 months ago
- Neo4j-based recommendation engine module with real-time and pre-computed recommendations. ☆378 · Updated 4 years ago
- An Elasticsearch ingest processor to do named entity extraction using Apache OpenNLP ☆272 · Updated 2 years ago
- Solr query parser plugin that performs proper query-time synonym expansion. ☆150 · Updated 4 years ago
- Generates more or less realistic log data for testing simple aggregation queries. ☆259 · Updated last year
- Java and REST APIs for working with time-representing tree in Neo4j ☆208 · Updated 4 years ago
- Query preprocessor for Java-based search engines (Querqy Core and Solr implementation) ☆184 · Updated last month
- Browser-driven explorer for Lucene indexes ☆74 · Updated 3 years ago
- A text tagger based on Lucene / Solr, using FST technology ☆176 · Updated last year
- Dice Solr Plugins from Simon Hughes Dice.com ☆87 · Updated 4 years ago
- An open-source, vendor-neutral data context service. ☆160 · Updated 7 years ago
- Mazerunner extends a Neo4j graph database to run scheduled big data graph compute algorithms at scale with HDFS and Apache Spark. ☆128 · Updated 9 years ago
- Fabric-based framework for deploying and managing SolrCloud clusters in the cloud. ☆90 · Updated 6 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 · Updated 5 years ago
- ☆92 · Updated 9 years ago
- Entity Extraction Text Processor ☆147 · Updated last year
- Entity resolution for Elasticsearch. ☆160 · Updated 6 months ago
- ☆61 · Updated 9 months ago
- The Schema Repo is a RESTful web service for storing and serving mappings between schema identifiers and schema definitions. ☆155 · Updated 3 years ago
- GitHub mirror of "search/highlighter" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access… ☆103 · Updated last month
- Graphify is a Neo4j unmanaged extension used for document and text classification using graph-based hierarchical pattern recognition. ☆379 · Updated 5 years ago
- A bundle of useful Elasticsearch plugins ☆111 · Updated last year
- Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ. ☆446 · Updated last year
- Spark RDD with Lucene's query and entity linkage capabilities ☆128 · Updated last month
- TinkerPop 3 implementation on Elasticsearch backend ☆70 · Updated 9 years ago