derrickburns / generalized-kmeans-clustering
Production-ready K-Means clustering for Apache Spark with pluggable Bregman divergences (KL, Itakura-Saito, L1, etc.). 6 algorithms, 740 tests, cross-version persistence. Drop-in replacement for MLlib with mathematically correct distance functions for probability distributions, spectral data, and count data.
☆341Updated last month
Alternatives and similar repositories for generalized-kmeans-clustering
Users who are interested in generalized-kmeans-clustering are comparing it to the libraries listed below.
- ☆712Updated 3 months ago
- Generate Cool-Looking Mazes and Animations Illustrating the A* Pathfinding Algorithm☆177Updated 9 months ago
- A Kurtosis package for Python data engineers, deploying a Jupyter notebook along with a configurable set of databases, and a visualizatio…☆109Updated last year
- Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l…☆286Updated 2 months ago
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines).☆255Updated 2 years ago
- Wayeb is a Complex Event Processing and Forecasting (CEP/F) engine written in Scala.☆150Updated 2 years ago
- A Detailed Introduction to My Favorite Statistical Measure, Hoeffding's D☆100Updated last year
- Visualize text embeddings☆40Updated 2 years ago
- R.L. methods and techniques.☆199Updated last year
- A BERT that you can train on a (gaming) laptop.☆209Updated 2 years ago
- Optimally allocate poker chips using constrained, nonlinear optimization☆174Updated 11 months ago
- Docker-based inference engine for AMD GPUs☆230Updated last year
- Run and explore Llama models locally with minimal dependencies on CPU☆190Updated last year
- convert a scikit-learn decision tree into a Keras model☆39Updated 2 years ago
- Testing various image matching algorithms' performance on the Pinecone vector DB☆43Updated 2 years ago
- Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems☆106Updated 7 months ago
- Proof of thought: LLM-based reasoning using Z3 theorem proving with multiple backend support (SMT2 and JSON DSL)☆356Updated last month
- Automated, smooth, N'th order derivatives of non-uniformly sampled time series data☆228Updated last year
- Implement recursion using English as the programming language and an LLM as the runtime.☆238Updated 2 years ago
- Grow virtual creatures in static and physics simulated environments.☆53Updated last year
- ☆280Updated 5 months ago
- Simplifying robust end-to-end machine learning on Apache Spark.☆473Updated 8 years ago
- Lightweight Pandas monkey-patch that adds async support to map, apply, applymap, aggregate, and transform, enabling seamless handling of …☆131Updated 5 months ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams.☆63Updated 9 years ago
- Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipeli…☆648Updated 3 weeks ago
- Examples and guides for using the VLM Run API☆297Updated last week
- Animating R1's thoughts.☆386Updated 9 months ago
- Agent Based Model on GPU using CUDA 12.2.1 and OpenGL 4.5 (CUDA OpenGL interop) on Windows/Linux☆75Updated 8 months ago
- Radient turns many data types (not just text) into vectors for similarity search, RAG, regression analysis, and more.☆280Updated last month
- ☆165Updated last year