Additional useful algorithms that can be used with Spark.
☆24 · Dec 24, 2014 · Updated 11 years ago
Alternatives and similar repositories for SparkAlgorithms
Users who are interested in SparkAlgorithms are comparing it to the libraries listed below.
- A primal-dual framework for distributed L1-regularized optimization ☆37 · Apr 18, 2016 · Updated 9 years ago
- The code for the in memory data pipeline that was presented at Berlin Buzzwords 2015. ☆10 · Jun 1, 2015 · Updated 10 years ago
- ☆56 · Aug 21, 2014 · Updated 11 years ago
- Data science repo to help others ☆12 · Feb 10, 2016 · Updated 10 years ago
- Prescriptive Applications over Kite and Hadoop ☆12 · Oct 14, 2015 · Updated 10 years ago
- Command line tool that transpiles scala code into java code. ☆12 · Sep 26, 2015 · Updated 10 years ago
- Coding exercises for Apache Spark ☆104 · Jun 4, 2015 · Updated 10 years ago
- Repo with sources for Spark blog posts and learning experiments in Spark ☆14 · Oct 16, 2015 · Updated 10 years ago
- Omnivore Optimizer and Distributed CcT ☆13 · Jun 17, 2016 · Updated 9 years ago
- An analysis on Aadhaar dataset using Mapreduce and Spark ☆14 · Feb 28, 2018 · Updated 8 years ago
- Secondary sort and streaming reduce for Apache Spark ☆78 · Jul 3, 2023 · Updated 2 years ago
- ☆12 · Apr 8, 2016 · Updated 9 years ago
- Quick summary: This code implements a spectral (third order tensor decomposition) learning method for learning LDA topic model on Spark. ☆104 · Jul 2, 2018 · Updated 7 years ago
- The released version of Astro(Spark SQL on HBase) has been moved to: ☆16 · Jul 23, 2015 · Updated 10 years ago
- Code for Springer Book: High Performance Distributed Computing: Case Studies with Hadoop, Scalding and Spark ☆15 · Oct 6, 2017 · Updated 8 years ago
- Import Salesforce data into Hadoop HDFS in Avro format ☆23 · Jan 8, 2020 · Updated 6 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 · Oct 27, 2015 · Updated 10 years ago
- Some Spark implementations of clustering algorithms. ☆19 · Nov 13, 2018 · Updated 7 years ago
- Real-time query spark and visualise it as graph. ☆24 · Oct 4, 2017 · Updated 8 years ago
- LocationSpark: A Distributed In-Memory Data Management System for Big Spatial Data ☆43 · Jan 6, 2017 · Updated 9 years ago
- Automatic offload of user-written Spark kernels to accelerators ☆18 · Oct 25, 2016 · Updated 9 years ago
- Activity recognition using Spark, Cassandra and MLlib ☆43 · Jan 5, 2017 · Updated 9 years ago
- Spark Streaming with Scala and Akka Activator template ☆44 · Jan 31, 2016 · Updated 10 years ago
- Locality Sensitive Hashing for Apache Spark ☆196 · Nov 1, 2016 · Updated 9 years ago
- A Neural network implementation with Scala ☆20 · Jul 17, 2016 · Updated 9 years ago
- Distributed Streaming Quantiles (for PySpark) ☆38 · Jan 30, 2014 · Updated 12 years ago
- Big Spatial Data Processing using Spark ☆146 · Mar 7, 2017 · Updated 8 years ago
- R package for inference on the Sharpe ratio. ☆20 · Dec 21, 2024 · Updated last year
- Factorization Machines on Spark and Glint ☆25 · Nov 7, 2016 · Updated 9 years ago
- Public Presentations ☆24 · Apr 13, 2025 · Updated 10 months ago
- MPI-oriented extension of the Spark computational model ☆24 · Jun 5, 2018 · Updated 7 years ago
- A connector for SingleStore and Spark ☆162 · Sep 24, 2025 · Updated 5 months ago
- Spark Extension : ML transformers, SQL aggregations, etc that are missing in Apache Spark ☆146 · Jan 26, 2016 · Updated 10 years ago
- A library you can include in your Spark job to validate the counters and perform operations on success. Goal is scala/java/python support… ☆108 · Feb 1, 2018 · Updated 8 years ago
- Spark-based approximate nearest neighbor search using locality-sensitive hashing ☆104 · Jul 5, 2016 · Updated 9 years ago
- Machine Learning over Twitter's stream. Using Apache Spark, Web Server and Lightning Graph server. ☆27 · Jun 19, 2016 · Updated 9 years ago
- FRED simulator and associated paper ☆26 · Jan 15, 2016 · Updated 10 years ago
- Anonymizing Library for Apache Spark ☆31 · Nov 9, 2023 · Updated 2 years ago
- Sparse feature extraction with Spark ☆30 · Jul 25, 2018 · Updated 7 years ago