derrickburns / generalized-kmeans-clustering
Production-ready K-Means clustering for Apache Spark with pluggable Bregman divergences (KL, Itakura-Saito, L1, etc). 6 algorithms, 740 tests, cross-version persistence. Drop-in replacement for MLlib with mathematically correct distance functions for probability distributions, spectral data, and count data.
☆341 Updated this week
Alternatives and similar repositories for generalized-kmeans-clustering
Users that are interested in generalized-kmeans-clustering are comparing it to the libraries listed below
- A package full of linear algebra operators for Apache Spark MLlib's linalg package ☆10 Sep 9, 2015 Updated 10 years ago
- Locality Sensitive Hashing for Apache Spark ☆196 Nov 1, 2016 Updated 9 years ago
- A versatile and powerful data platform allowing interactive searches, dashboards, alerts, and more. ☆26 Sep 12, 2025 Updated 5 months ago
- An email segmentation system (reference implementation of ECIR 2018 paper) ☆10 Oct 21, 2019 Updated 6 years ago
- Gust is a set of GPU extensions for Breeze. ☆32 Apr 10, 2015 Updated 10 years ago
- Spark Extension : ML transformers, SQL aggregations, etc that are missing in Apache Spark ☆146 Jan 26, 2016 Updated 10 years ago
- Monte Carlo leak diagnostic for Linux binaries ☆21 Apr 16, 2019 Updated 6 years ago
- Automatically generated text for brainstorming/mindmapping purposes. ☆25 Jul 15, 2023 Updated 2 years ago
- Generic Implementation of Consensus ADMM over Spark ☆84 Jul 8, 2016 Updated 9 years ago
- Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipeli… ☆651 Feb 1, 2026 Updated 2 weeks ago
- Fast similarity search using DuckDB ☆146 Oct 30, 2024 Updated last year
- Visualize streaming machine learning in Spark ☆177 Jun 29, 2017 Updated 8 years ago
- ☆721 Aug 15, 2025 Updated 6 months ago
- SBT Plugins for AI2 projects ☆24 Oct 14, 2022 Updated 3 years ago
- High-Performance Klong array language in Python. ☆312 Feb 5, 2026 Updated last week
- Real Time Analytics and Data Pipelines based on Spark Streaming ☆531 Oct 24, 2019 Updated 6 years ago
- Text Mining Library with a focus on Latent Semantic Analysis ☆12 Aug 4, 2013 Updated 12 years ago
- Real-time YouTube comment sentiment analysis using Kafka, Spark, and Streamlit dashboard. ☆10 Oct 2, 2024 Updated last year
- Benchmarks of artificial neural network library for Spark MLlib ☆11 Dec 3, 2015 Updated 10 years ago
- Wolfe Language and Engine ☆134 Mar 24, 2017 Updated 8 years ago
- A Neural network implementation with Scala ☆20 Jul 17, 2016 Updated 9 years ago
- Algebraic enhancements for GEMM & AI accelerators ☆287 Feb 28, 2025 Updated 11 months ago
- R.L. methods and techniques. ☆199 Jan 23, 2026 Updated 3 weeks ago
- Experiments with applying Fourier transforms to various plane-filling curves and patterns ☆65 Apr 17, 2023 Updated 2 years ago
- A BERT that you can train on a (gaming) laptop. ☆209 Sep 8, 2023 Updated 2 years ago
- Lamport's Bakery Algorithm Demonstrated in Python ☆95 Jan 19, 2024 Updated 2 years ago
- Virtual Control Admin, a free Openvz html5 panel ☆11 Aug 5, 2020 Updated 5 years ago
- a very fast parser for sparse matrix at libsvm format ☆10 Nov 13, 2017 Updated 8 years ago
- distinct counters and aggregate functions for distinct estimation in PostgreSQL ☆17 Nov 3, 2019 Updated 6 years ago
- A reference implementation for a weave data structure to allow quick reconstruction of old versions of a compressed repository in version… ☆17 Jan 2, 2016 Updated 10 years ago
- The format that's super! ☆35 Updated this week
- Docker-based inference engine for AMD GPUs ☆231 Oct 7, 2024 Updated last year
- A cookiecutter template for a Sphinx extension ☆28 Jan 11, 2024 Updated 2 years ago
- A reimplementation of https://github.com/otiai10/gosseract without CGo, running Tesseract compiled to WASM with Wazero ☆154 Jun 23, 2025 Updated 7 months ago
- Splash Project for parallel stochastic learning ☆93 Jun 16, 2017 Updated 8 years ago
- Animating R1's thoughts. ☆384 Feb 17, 2025 Updated last year
- Distributed lbfgs on Apache Spark ☆11 Sep 25, 2020 Updated 5 years ago
- Simple implementation of KMeans clustering on Flink, using iterations ☆10 Nov 15, 2018 Updated 7 years ago
- Self-hostable headless QR code generator ☆18 Feb 5, 2026 Updated last week