unchartedsoftware / ensemble-clustering
Uncharted Ensemble Clustering is a flexible multi-threaded clustering library for rapidly constructing tailored clustering solutions that leverage the different semantic aspects of heterogeneous data. The library can run on a single machine using multi-threading or in a distributed setting using Spark.
☆32 Updated 9 years ago
Alternatives and similar repositories for ensemble-clustering:
Users who are interested in ensemble-clustering are comparing it to the libraries listed below.
- ☆20 Updated 8 years ago
- ☆20 Updated 7 years ago
- General Vectorization Lib for Machine Learning Tools ☆31 Updated 8 years ago
- A Text Classification API in Java originally developed by DigitalPebble Ltd. The API is independent from the ML implementations used and … ☆48 Updated 3 years ago
- The main - so far, only - repository for the SmileWide project. ☆32 Updated 8 years ago
- Hadoop MapReduce over Hive based implementation of attributed network pattern matching. ☆40 Updated 10 years ago
- NLP Utilities in Java ☆43 Updated 2 years ago
- SmallK: very fast data clustering tools ☆14 Updated 5 years ago
- An implementation of locality sensitive hashing with Hadoop ☆57 Updated 10 years ago
- ☆20 Updated 7 years ago
- SNAP repository for Ringo ☆14 Updated 7 years ago
- A Java library for Stochastic Gradient Descent (SGD) ☆21 Updated 3 years ago
- Base components for Question Answering pipelines ☆28 Updated 2 years ago
- Pattern-of-Behavior Search Tool ☆11 Updated 2 years ago
- A parallel IRWLS library to solve SVMs and budgeted SVMs ☆59 Updated 7 years ago
- An implementation of Gibbs sampling for Latent Dirichlet Allocation ☆30 Updated 13 years ago
- Library for building reproducible data pipelines to support experimentation ☆20 Updated 9 years ago
- Templates for projects based on top of H2O. ☆37 Updated 3 months ago
- Predictive analytics using Deeplearning4j and Spark ☆26 Updated 8 years ago
- A tool for calculating semantic similarity between words from a text corpus based on lexico-syntactic patterns. ☆28 Updated 9 years ago
- Distributed implementation of Robust PLSA using Spark ☆12 Updated 3 years ago
- Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets ☆92 Updated 9 years ago
- From zero to Storm cluster for real-time classification using sklearn ☆12 Updated 10 years ago
- Python implementation of nonparametric nearest-neighbor-based estimators for divergences between distributions. ☆48 Updated 7 years ago
- Java library for Concrete, a data serialization format for NLP ☆6 Updated 5 years ago
- Scalable inference for Correlated Topic Models ☆30 Updated 9 years ago
- Named Entity Extraction on Twitter Stream using Apache Spark Streaming and Stanford CoreNLP ☆15 Updated 8 years ago
- Solr Dictionary Annotator (Microservice for Spark) ☆71 Updated 5 years ago
- Using latent Dirichlet allocation (LDA) in Apache Lucene ☆58 Updated 12 years ago
- The information sieve for discrete variables. ☆36 Updated 8 years ago