lintool / bespin
Reference implementations of data-intensive algorithms in MapReduce and Spark
☆82 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for bespin
- Training materials for Strata, AMP Camp, etc. ☆150 · Updated 9 years ago
- A curated inventory of machine learning methods available on the Apache Spark platform, in both official and third-party libraries. ☆65 · Updated 7 years ago
- ☆92 · Updated 9 years ago
- Assembly of fundamental statistics implemented based on Apache Spark ☆31 · Updated 8 years ago
- Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark ☆147 · Updated 8 years ago
- Word2Vec models with Twitter data using Spark. ☆65 · Updated 5 years ago
- An implementation of the Markov Clustering algorithm for Spark in Scala ☆34 · Updated 7 years ago
- Locality Sensitive Hashing for Apache Spark ☆196 · Updated 8 years ago
- Oracle Data Science Bootcamp 2014 ☆25 · Updated 9 years ago
- Course repository for Applied Natural Language Processing ☆124 · Updated 11 years ago
- Examples for Fast Data Processing with Spark ☆59 · Updated 11 years ago
- Approximate Nearest Neighbors in Spark ☆174 · Updated 3 years ago
- Spark-based approximate nearest neighbor search using locality-sensitive hashing ☆104 · Updated 8 years ago
- A Spark-based LexRank extractive summarizer for text documents ☆19 · Updated 8 years ago
- Locality Sensitive Hashing for Apache Spark ☆88 · Updated 2 years ago
- Practical examples of using Apache Spark in several different use cases ☆104 · Updated 8 years ago
- Elasticsearch Latent Semantic Indexing experimentation ☆33 · Updated 5 years ago
- An API for Distributed Machine Learning ☆154 · Updated 8 years ago
- ADMM-based large-scale logistic regression ☆335 · Updated 11 months ago
- Distributed Streaming Matrix Factorization implemented on Spark for Recommendation Systems ☆107 · Updated 8 years ago
- Distributed Matrix Library ☆70 · Updated 7 years ago
- Reference Architectures for Apache Spark ☆38 · Updated 7 years ago
- ☆24 · Updated 9 years ago
- Lintools: tools by @lintool ☆22 · Updated 5 years ago
- An efficient updatable key-value store for Apache Spark ☆250 · Updated 7 years ago
- Affinity Propagation on Spark ☆19 · Updated 3 years ago
- Topic Modeling with LDA in Scala and Spark ☆31 · Updated 6 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆62 · Updated 8 years ago
- A simple example of how to use Naive Bayes on Spark with the popular Reuters-21578 dataset ☆23 · Updated 10 years ago