mitdbg / bigdata
MIT Big Data Challenge
☆14 · Updated 11 years ago
Alternatives and similar repositories for bigdata
Users who are interested in bigdata are comparing it to the libraries listed below:
- dllib is a distributed deep learning library running on Apache Spark ☆32 · Updated 7 years ago
- Example code for building your own MemSQL Streamliner Pipelines ☆23 · Updated 8 years ago
- A library of machine learning algorithms implemented using principles of functional programming. ☆23 · Updated 8 years ago
- Reduce your data. A unix filter for algebird-powered aggregation. ☆140 · Updated 8 years ago
- Use cases built on SnappyData. Use cases contained here: 1. Ad Analytics 2. Streaming data ingestion from RabbitMQ. ☆32 · Updated 3 years ago
- Sparse feature extraction with Spark ☆30 · Updated 6 years ago
- DEPRECATED! Use https://github.com/h2oai/sparkling-water repository! H2O and Spark interoperability based on Tachyon. ☆44 · Updated 10 years ago
- An Akka Extension for easy integration of spark and cassandra in Akka micro services. ☆25 · Updated 10 years ago
- A chef cookbook for deploying spark ☆30 · Updated 12 years ago
- Examples for Fast Data Processing with Spark ☆59 · Updated 11 years ago
- Reactive Outlier Detection Engine ☆11 · Updated 10 years ago
- Cascading and Scalding wrapper for HBase with advanced read features ☆54 · Updated 5 years ago
- VoltDB Click Stream Processing Example. ☆16 · Updated 7 years ago
- GPU Acceleration for Apache Spark ☆34 · Updated 9 years ago
- Spark MLlib code optimized to efficiently support sparse data ☆51 · Updated 8 years ago
- Text Classification Engine ☆36 · Updated 6 years ago
- Machine Learning over Twitter's stream. Using Apache Spark, Web Server and Lightning Graph server. ☆27 · Updated 9 years ago
- An API for Distributed Machine Learning ☆155 · Updated 8 years ago
- Automatic offload of user-written Spark kernels to accelerators ☆18 · Updated 8 years ago
- phData Pulse application log aggregation and monitoring ☆13 · Updated 5 years ago
- Repository for SF QConf 2015 Workshop ☆16 · Updated 8 months ago
- ReactiveLDA is a fast, lightweight implementation of the Latent Dirichlet Allocation (LDA) algorithm, using a parallel vanilla Gibbs samp… ☆61 · Updated 10 years ago
- Named Entity Extraction on Twitter Stream using Apache Spark Streaming and Stanford CoreNLP ☆15 · Updated 8 years ago
- Exploration Library in Java ☆12 · Updated 2 years ago
- Graph Analytics Engine ☆260 · Updated 10 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 · Updated 8 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 · Updated 8 years ago
- A collection of efficient utilities for a data scientist. ☆41 · Updated 10 years ago
- This is an introduction of Apache Spark DataFrames. ☆41 · Updated 10 years ago
- A Scala framework to build derived datasets, aka batch views, of Telemetry data. ☆35 · Updated 3 years ago