mitdbg / bigdata
MIT Big Data Challenge
☆14 Updated 11 years ago
Alternatives and similar repositories for bigdata
Users that are interested in bigdata are comparing it to the libraries listed below
- Example code for building your own MemSQL Streamliner Pipelines ☆23 Updated 8 years ago
- Cascading and Scalding wrapper for HBase with advanced read features ☆54 Updated 5 years ago
- Exploration Library in Java ☆12 Updated last year
- Example project which simulates an interesting analytics use case using MemSQL Pipelines. ☆14 Updated 8 years ago
- This is an introduction of Apache Spark DataFrames. ☆41 Updated 10 years ago
- Sparse feature extraction with Spark ☆30 Updated 6 years ago
- Thin reactive framework to provide and consume REST services ☆48 Updated 10 years ago
- Embedded Kafka for testing and quick prototyping. ☆14 Updated 9 years ago
- Road to Continuous Upgrade ☆15 Updated last year
- An Akka Extension for easy integration of spark and cassandra in Akka micro services. ☆25 Updated 10 years ago
- Use cases built on SnappyData. Use cases contained here: 1. Ad Analytics 2. Streaming data ingestion from RabbitMQ. ☆32 Updated 2 years ago
- dllib is a distributed deep learning library running on Apache Spark ☆32 Updated 7 years ago
- Code and Data Samples for Big Data Warehousing. ☆10 Updated 9 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 8 years ago
- A collection of Scala graph libraries and adapters for graph databases. ☆15 Updated 8 years ago
- Reduce your data. A unix filter for algebird-powered aggregation. ☆139 Updated 8 years ago
- A library of machine learning algorithms implemented using principles of functional programming. ☆23 Updated 8 years ago
- A package full of linear algebra operators for Apache Spark MLlib's linalg package ☆10 Updated 9 years ago
- Mirror of Apache Bookkeeper ☆15 Updated 6 years ago
- scalding powered machine learning ☆109 Updated 10 years ago
- Stand-alone ANSI SQL for Cascading on Apache Hadoop ☆48 Updated 7 years ago
- A collection of efficient utilities for a data scientist. ☆41 Updated 10 years ago
- Alenka JDBC is a library for accessing and manipulating data with the open-source GPU database Alenka. ☆19 Updated 10 years ago
- Slides from talks and preprints of the publications. ☆63 Updated 5 years ago
- Sample custom Nifi processor to process tcpdump ☆18 Updated 9 years ago
- Machine Learning for Cascading ☆82 Updated 10 years ago
- GPU Acceleration for Apache Spark ☆34 Updated 9 years ago
- Akka Cluster for Value-at-Risk calculation ☆14 Updated 11 years ago
- Use Cascading Taps and Scalding DSL with Spark ☆49 Updated 8 years ago
- Spark MLlib code optimized to efficiently support sparse data ☆51 Updated 8 years ago