ankurdave / kmeans-spark
A simple implementation of k-means clustering on the Spark cluster computing framework. See http://cs.berkeley.edu/~matei/spark.
☆27 Updated 14 years ago
Alternatives and similar repositories for kmeans-spark
Users who are interested in kmeans-spark are comparing it to the libraries listed below.
- Training materials for Strata, AMP Camp, etc ☆149 Updated 9 years ago
- edXSpark ☆21 Updated 9 years ago
- Locality Sensitive Hashing for Apache Spark ☆196 Updated 8 years ago
- ☆110 Updated 8 years ago
- Assembly of fundamental statistics implemented based on Apache Spark ☆31 Updated 9 years ago
- Code for Packt Publishing's Scala Data Analysis Cookbook. ☆48 Updated 9 years ago
- A curated inventory of machine learning methods available on the Apache Spark platform, both in official and third party libraries. ☆65 Updated 8 years ago
- Additional useful algorithms that can be used with spark. ☆24 Updated 10 years ago
- Visualize streaming machine learning in Spark ☆177 Updated 8 years ago
- Zeppelin notebook examples ☆25 Updated 9 years ago
- This repository contains code files specifically IPython notebooks for the assignments in the course "Scalable Machine Learning" by UC Be… ☆31 Updated 10 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆16 Updated 5 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin ☆52 Updated 9 years ago
- Scripts to analyze Spark's performance ☆136 Updated 7 years ago
- ☆19 Updated 9 years ago
- A Distributed Matrix Operations Library Built on Top of Spark ☆107 Updated 8 years ago
- An example of using Avro and Parquet in Spark SQL ☆60 Updated 9 years ago
- Spark 2.0 Scala Machine Learning examples ☆77 Updated 6 years ago
- Locality Sensitive Hashing for Apache Spark ☆87 Updated 3 years ago
- Spark Extension : ML transformers, SQL aggregations, etc that are missing in Apache Spark ☆146 Updated 9 years ago
- Former GraphX development repository. GraphX has been merged into Apache Spark; please submit pull requests there. ☆360 Updated 2 years ago
- Distributed Streaming Matrix Factorization implemented on Spark for Recommendation Systems ☆107 Updated 9 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 Updated 9 years ago
- tutorials and samples that show you how to get the most out of IBM Analytics for Apache Spark ☆79 Updated 7 years ago
- Joins for skewed datasets in Spark ☆57 Updated 8 years ago
- ☆24 Updated 10 years ago
- Example project to show how to use Spark to read and write Avro/Parquet files ☆50 Updated 12 years ago
- Ansible recipes for Berkeley Data Analytics Stack deployment ☆16 Updated 10 years ago
- Self-written notes that may be useful ☆107 Updated 9 years ago
- An implementation of Markov Clustering algorithm for Spark in Scala ☆34 Updated 8 years ago