mitdbg / bigdata

MIT Big Data Challenge

☆14

Alternatives and similar repositories for bigdata

Users who are interested in bigdata are comparing it to the libraries listed below.

Lewuathe / dllib
dllib is a distributed deep learning library running on Apache Spark
☆32 · Updated 7 years ago
memsql / streamliner-examples
Example code for building your own MemSQL Streamliner Pipelines
☆23 · Updated 8 years ago
malcolmgreaves / fp4ml
A library of machine learning algorithms implemented using principles of functional programming.
☆23 · Updated 8 years ago
avibryant / simmer
Reduce your data. A unix filter for algebird-powered aggregation.
☆140 · Updated 8 years ago
TIBCOSoftware / snappy-examples
Use cases built on SnappyData. Use cases contained here: 1. Ad Analytics 2. Streaming data ingestion from RabbitMQ.
☆32 · Updated 3 years ago
collectivemedia / modelmatrix
Sparse feature extraction with Spark
☆30 · Updated 6 years ago
h2oai / h2o-sparkling
DEPRECATED! Use https://github.com/h2oai/sparkling-water repository! H2O and Spark interoperability based on Tachyon.
☆44 · Updated 10 years ago
helena / spark-cassandra
An Akka Extension for easy integration of spark and cassandra in Akka micro services.
☆25 · Updated 10 years ago
holdenk / chef-cookbook-spark
A chef cookbook for deploying spark
☆30 · Updated 12 years ago
holdenk / fastdataprocessingwithsparkexamples
Examples for Fast Data Processing with Spark
☆59 · Updated 11 years ago
skrusche63 / spark-outlier
Reactive Outlier Detection Engine
☆11 · Updated 10 years ago
ParallelAI / SpyGlass
Cascading and Scalding wrapper for HBase with advanced read features
☆54 · Updated 5 years ago
VoltDB / app-fastdata
VoltDB Click Stream Processing Example.
☆16 · Updated 7 years ago
adobe-research / spark-gpu
GPU Acceleration for Apache Spark
☆34 · Updated 9 years ago
intel-spark / SparseML
Spark MLlib code optimized to efficiently support sparse data
☆51 · Updated 8 years ago
apache / predictionio-template-text-classifier
Text Classification Engine
☆36 · Updated 6 years ago
giorgioinf / twitter-stream-ml
Machine Learning over Twitter's stream. Using Apache Spark, Web Server and Lightning Graph server.
☆27 · Updated 9 years ago
amplab / MLI
An API for Distributed Machine Learning
☆155 · Updated 8 years ago
agrippa / spark-swat
Automatic offload of user-written Spark kernels to accelerators
☆18 · Updated 8 years ago
phdata / pulse
phData Pulse application log aggregation and monitoring
☆13 · Updated 5 years ago
h2oai / qcon2015
Repository for SF QConf 2015 Workshop
☆16 · Updated 8 months ago
kifi / ReactiveLDA
ReactiveLDA is a fast, lightweight implementation of the Latent Dirichlet Allocation (LDA) algorithm, using a parallel vanilla Gibbs samp…
☆61 · Updated 10 years ago
dhwajraj / spark-twitter-named-entity
Named Entity Extraction on Twitter Stream using Apache Spark Streaming and Stanford CoreNLP
☆15 · Updated 8 years ago
microsoft / mwt-ds-explore-java
Exploration Library in Java
☆12 · Updated 2 years ago
thinkaurelius / faunus
Graph Analytics Engine
☆260 · Updated 10 years ago
tresata / spark-scalding
Use Cascading Taps and Scalding DSL with Spark
☆49 · Updated 8 years ago
memsql / streamliner-starter
Starter project for building MemSQL Streamliner Pipelines
☆32 · Updated 8 years ago
LinkedInAttic / datacl
A collection of efficient utilities for a data scientist.
☆41 · Updated 10 years ago
yu-iskw / spark-dataframe-introduction
This is an introduction of Apache Spark DataFrames.
☆41 · Updated 10 years ago
mozilla / telemetry-batch-view
A Scala framework to build derived datasets, aka batch views, of Telemetry data.
☆35 · Updated 3 years ago