dimkar121 / LSHDB
LSHDB is a parallel and distributed data engine that relies on Locality-Sensitive Hashing and NoSQL systems to perform record linkage (including privacy-preserving record linkage) and similarity search.
☆28 Updated 2 years ago
Related projects:
- Blazegraph Tinkerpop3 Implementation ☆59 Updated 3 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆70 Updated 4 years ago
- Provided Guidance on Creating End to End Solutions for Common SILK Use Cases ☆13 Updated 8 years ago
- A framework for scalable graph computing. ☆148 Updated 6 years ago
- Demo application for GRADOOP operators ☆23 Updated 4 years ago
- A library to store metadata of relational databases including the schema, statistics, and integrity constraints. ☆24 Updated 6 years ago
- A Generalized Data Cleaning System ☆47 Updated 8 years ago
- Dynamic Distributed Dimensional Data Model ☆40 Updated 4 months ago
- A web based data mining workflow platform with real-time analysis capabilities ☆48 Updated last year
- A system and a Java API for large-scale graph processing based on Google's Pregel ☆63 Updated 11 years ago
- A toolkit for clustering web pages based on various similarity measures. ☆32 Updated 2 years ago
- Algorithms that build k-nearest neighbors graph (k-nn graph): Brute-force, NN-Descent,... ☆34 Updated 5 years ago
- High-security graph database ☆62 Updated 2 years ago
- An Exploration into Graph Databases ☆28 Updated 8 years ago
- KnowledgeStore ☆20 Updated 6 years ago
- ☆23 Updated 4 years ago
- Stanford Entity-Resolution Framework ☆23 Updated 6 years ago
- Scalable Graph Mining ☆61 Updated last year
- A bunch of fancy soft string matching routines, with some accompanying datasets ☆54 Updated 7 years ago
- pythonic access to fastbit ☆26 Updated 6 years ago
- Vizlinc ☆14 Updated 8 years ago
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine ☆46 Updated 5 years ago
- GPU Acceleration for Apache Spark ☆34 Updated 9 years ago
- This project contains the code to translate between Apache Spark and SFrame. ☆21 Updated 8 years ago
- ☆92 Updated 8 years ago
- Using latent Dirichlet allocation (LDA) in Apache Lucene ☆58 Updated 11 years ago
- How to spot first stories on Twitter using Storm. ☆124 Updated 9 months ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 Updated 7 years ago
- A framework to benchmark different graph databases, based on generated data from customizable schema, distribution, and size. ☆26 Updated 5 years ago