dimkar121 / LSHDB
LSHDB is a parallel and distributed data engine that relies on Locality-Sensitive Hashing and NoSQL systems to perform record linkage (including privacy-preserving record linkage) and similarity search.
☆31 Updated 2 years ago
Alternatives and similar repositories for LSHDB:
Users who are interested in LSHDB compare it to the libraries listed below.
- A framework for scalable graph computing. ☆147 Updated 6 years ago
- Apache NiFi NLP Processor ☆18 Updated last year
- Blazegraph Tinkerpop3 Implementation ☆61 Updated 4 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 Updated 5 years ago
- Dynamic Distributed Dimensional Data Model ☆41 Updated 11 months ago
- Provided Guidance on Creating End to End Solutions for Common SILK Use Cases ☆13 Updated 9 years ago
- How to spot first stories on Twitter using Storm. ☆125 Updated last year
- Graphulo: Accumulo library of matrix math primitives and graph algorithms ☆78 Updated 11 months ago
- A library to store metadata of relational databases including the schema, statistics, and integrity constraints. ☆25 Updated 6 years ago
- GPU Acceleration for Apache Spark ☆34 Updated 9 years ago
- TensorDB: In-Database Tensor Manipulation with Tensor-Relational Query Plans ☆20 Updated 10 years ago
- Java parsers for different RDF serialisations + API + tools + JAX-RS integration ☆20 Updated 3 years ago
- phData Pulse application log aggregation and monitoring ☆13 Updated 4 years ago
- Tools for building a Lucene index for Semantic Vectors ☆21 Updated 9 years ago
- Algorithms that build k-nearest neighbors graph (k-nn graph): Brute-force, NN-Descent,... ☆34 Updated 6 years ago
- Ductile DB is a graph database based on Hadoop/HBase which provides a vast set of features. ☆13 Updated 7 years ago
- A single docker image that combines Neo4j Mazerunner and Apache Spark GraphX into a powerful all-in-one graph processing engine ☆46 Updated 5 years ago
- Vizlinc ☆14 Updated 9 years ago
- A system and a Java API for large-scale graph processing based on Google's Pregel ☆64 Updated 12 years ago
- Named Entity Extraction on Twitter Stream using Apache Spark Streaming and Stanford CoreNLP ☆15 Updated 8 years ago
- A web based data mining workflow platform with real-time analysis capabilities ☆49 Updated 2 years ago
- S2RDF (SPARQL on Spark for RDF) is a SPARQL query processor for Hadoop based on Spark SQL. It uses the relational interface of Spark for … ☆13 Updated 6 years ago
- A Text Classification API in Java originally developed by DigitalPebble Ltd. The API is independent from the ML implementations used and … ☆48 Updated 3 years ago
- Scalable Optical Character Recognition with Apache NiFi and Tesseract ☆32 Updated 8 years ago
- System for mining Wikipedia Usage data to read our collective mind ☆21 Updated 10 years ago