YahooArchive / simplified-lambda
☆36 Updated 9 years ago
Alternatives and similar repositories for simplified-lambda:
Users who are interested in simplified-lambda are comparing it to the libraries listed below:
- Sparse feature extraction with Spark ☆30 Updated 6 years ago
- Machine learning and natural language processing with Apache Pig ☆53 Updated 11 years ago
- Interactive Audience Analytics with Spark and HyperLogLog ☆55 Updated 9 years ago
- This is an introduction of Apache Spark DataFrames. ☆41 Updated 9 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 7 years ago
- Scriptable scheduler for periodical Hadoop workflows ☆22 Updated 7 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 Updated 8 years ago
- Set of real time stream processing algorithms that can be used by big data streaming platform ☆72 Updated 4 years ago
- Muppet ☆126 Updated 3 years ago
- Integration of Samza and Luwak ☆99 Updated 10 years ago
- A/B experiments service ☆33 Updated this week
- Distributed Matrix Library ☆70 Updated 8 years ago
- Scala client for the Lightning data visualization server (WIP) ☆47 Updated 5 years ago
- Experiments with the GDELT dataset and Cassandra schemas. ☆25 Updated 9 years ago
- Deprecated - Check out MemSQL Pipelines instead! ☆8 Updated 7 years ago
- A Cascading Workflow Visualizer ☆83 Updated last year
- Examples for Fast Data Processing with Spark ☆59 Updated 11 years ago
- Cascading on Apache Flink® ☆54 Updated last year
- An Apache Spark-shell backend for IPython ☆105 Updated 3 years ago
- ReactiveLDA is a fast, lightweight implementation of the Latent Dirichlet Allocation (LDA) algorithm, using a parallel vanilla Gibbs samp… ☆61 Updated 9 years ago
- Training materials for Strata, AMP Camp, etc ☆150 Updated 9 years ago
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark ☆51 Updated 11 years ago
- Example code for "Web-Scale Computer Vision using MapReduce for Multimedia Data Mining" ☆49 Updated 14 years ago
- Hadoop mapreduce job to bulk load data into Cassandra ☆75 Updated 2 years ago
- Serving system for batch generated data sets ☆176 Updated 7 years ago
- Apache Spark jobs such as Principal Coordinate Analysis. ☆74 Updated 8 years ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines ☆29 Updated 9 years ago
- ☆9 Updated 9 years ago
- Movie recommendations and more in MapReduce and Scalding ☆117 Updated 12 years ago
- An Akka Extension for easy integration of spark and cassandra in Akka micro services. ☆25 Updated 10 years ago