derrickburns / generalized-kmeans-clustering
Spark library for generalized K-Means clustering. Supports general Bregman divergences. Suitable for clustering probabilistic data, time series data, high dimensional data, and very large data.
☆302 Updated 2 months ago
Alternatives and similar repositories for generalized-kmeans-clustering
Users who are interested in generalized-kmeans-clustering are comparing it to the libraries listed below.
- ☆580 Updated last month
- Locality Sensitive Hashing for Apache Spark ☆195 Updated 8 years ago
- Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark ☆147 Updated 9 years ago
- Simplifying robust end-to-end machine learning on Apache Spark. ☆472 Updated 8 years ago
- k-Nearest Neighbors algorithm on Spark ☆240 Updated last year
- An implementation of DBSCAN running on top of Apache Spark ☆183 Updated 7 years ago
- Scalable Machine Learning in Scalding ☆360 Updated 7 years ago
- Automated, smooth, N'th order derivatives of non-uniformly sampled time series data ☆226 Updated 7 months ago
- Lamport's Bakery Algorithm Demonstrated in Python ☆97 Updated last year
- DBSCAN clustering algorithm on top of Apache Spark ☆260 Updated 7 years ago
- Generate Cool-Looking Mazes and Animations Illustrating the A* Pathfinding Algorithm ☆177 Updated 3 months ago
- Distributed Deep Learning on Spark ☆402 Updated 8 years ago
- Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems ☆103 Updated 2 months ago
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines). ☆251 Updated last year
- Generic Implementation of Consensus ADMM over Spark ☆84 Updated 8 years ago
- An efficient updatable key-value store for Apache Spark ☆251 Updated 8 years ago
- Spark-based approximate nearest neighbor search using locality-sensitive hashing ☆104 Updated 8 years ago
- Visualize text embeddings ☆40 Updated last year
- The Nak Machine Learning Library ☆342 Updated 7 years ago
- Distributed decision tree ensemble learning in Scala ☆393 Updated 6 years ago
- Self-written notes that may be useful ☆107 Updated 9 years ago
- BigTable, Document and Graph Database with Full Text Search ☆186 Updated 7 years ago
- ☆111 Updated 8 years ago
- Zen aims to provide the largest scale and the most efficient machine learning platform on top of Spark, including but not limited to logi… ☆170 Updated 6 years ago
- Building Annoy Index on Apache Spark ☆72 Updated 4 years ago
- A Detailed Introduction to My Favorite Statistical Measure, Hoeffding's D ☆96 Updated last year
- Global Vectors for Word Representation on Spark ☆35 Updated 10 years ago
- Optimally allocate poker chips using constrained, nonlinear optimization ☆174 Updated 5 months ago
- A Kurtosis package for Python data engineers, deploying a Jupyter notebook along with a configurable set of databases, and a visualizatio… ☆109 Updated last year
- Better Bookmarks Search w/ Transformers ☆194 Updated last year