dimkar121 / LSHDB
LSHDB is a parallel and distributed data engine that relies on Locality-Sensitive Hashing and NoSQL systems to perform record linkage (including privacy-preserving record linkage) and similarity search.
☆29 Updated 2 years ago
Alternatives and similar repositories for LSHDB:
Users who are interested in LSHDB are comparing it to the libraries listed below:
- A web based data mining workflow platform with real-time analysis capabilities ☆49 Updated 2 years ago
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine ☆46 Updated 5 years ago
- Blazegraph Tinkerpop3 Implementation ☆61 Updated 4 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 Updated 4 years ago
- ☆23 Updated 5 years ago
- A Java framework to build semantics-aware autoencoder neural network from a knowledge-graph. ☆13 Updated 7 years ago
- How to spot first stories on Twitter using Storm. ☆125 Updated last year
- A Java library for stored queries ☆16 Updated last year
- Stanford Entity-Resolution Framework ☆23 Updated 6 years ago
- Dynamic Distributed Dimensional Data Model ☆41 Updated 8 months ago
- Provided Guidance on Creating End to End Solutions for Common SILK Use Cases ☆13 Updated 9 years ago
- ☆20 Updated 7 years ago
- High-security graph database ☆62 Updated 2 years ago
- A framework to benchmark different graph databases, based on generated data from customizable schema, distribution, and size. ☆26 Updated 5 years ago
- Algorithms that build k-nearest neighbors graph (k-nn graph): Brute-force, NN-Descent,... ☆34 Updated 5 years ago
- A bunch of fancy soft string matching routines, with some accompanying datasets ☆56 Updated 7 years ago
- A library to store metadata of relational databases including the schema, statistics, and integrity constraints. ☆25 Updated 6 years ago
- Java port of TLSH (Trend Micro Locality Sensitive Hash) ☆20 Updated 3 years ago
- An Exploration into Graph Databases ☆28 Updated 9 years ago
- Fast and robust NLP components implemented in Java. ☆52 Updated 4 years ago
- pythonic access to fastbit ☆26 Updated 6 years ago
- Templates for projects based on top of H2O. ☆37 Updated 2 months ago
- Code to allow running BIDMach on Spark including HDFS integration and lightweight sparse model updates (Kylix). ☆15 Updated 4 years ago
- Uncharted Ensemble Clustering is a flexible multi-threaded clustering library for rapidly constructing tailored clustering solutions that… ☆32 Updated 9 years ago
- A systematic Benchmarking on the performance of Spark-SQL for processing Vast RDF datasets ☆14 Updated 2 years ago
- A comparative benchmark between relational database systems and their graph based counterpart. ☆37 Updated 7 years ago
- ☆41 Updated 7 years ago
- A Generalized Data Cleaning System ☆49 Updated 8 years ago