twitter / summingbird
Streaming MapReduce with Scalding and Storm
☆2,131 · Updated 3 years ago
Alternatives and similar repositories for summingbird
Users who are interested in summingbird are comparing it to the libraries listed below.
- Distributed Prometheus time series database ☆1,453 · Updated last week
- A Scala API for Cascading ☆3,519 · Updated 2 years ago
- Abstract Algebra for Scala ☆2,295 · Updated 3 weeks ago
- Lightweight real-time big data streaming engine over Akka ☆759 · Updated 3 years ago
- Reversible conversions between types ☆658 · Updated 9 months ago
- Mirror of Apache Samza ☆832 · Updated 4 months ago
- A Thrift parser/generator ☆797 · Updated 5 months ago
- I/O and Microservice library for Scala ☆1,136 · Updated 4 years ago
- Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in… ☆1,036 · Updated 2 years ago
- Simplifying robust end-to-end machine learning on Apache Spark. ☆474 · Updated 8 years ago
- Storehaus is a library that makes it easy to work with asynchronous key value stores ☆466 · Updated 5 years ago
- Cassovary is a simple big graph processing library for the JVM ☆1,051 · Updated 3 years ago
- A Bulk Data Pipeline out of Cassandra ☆323 · Updated 6 years ago
- Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code. ☆1,137 · Updated 2 years ago
- Tranquility helps you send real-time event streams to Druid and handles partitioning, replication, service discovery, and schema rollover… ☆516 · Updated 5 years ago
- KillrWeather is a reference application (work in progress) showing how to easily integrate streaming and batch data processing with Apach… ☆1,182 · Updated 8 years ago
- LinkedIn's previous generation Kafka to HDFS pipeline. ☆883 · Updated 5 years ago
- Netflix's distributed Data Pipeline ☆798 · Updated 2 years ago
- BlinkDB: Sub-Second Approximate Queries on Very Large Data. ☆660 · Updated 11 years ago
- Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning ☆1,785 · Updated 4 years ago
- Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. ☆852 · Updated 4 years ago
- Mirror of Apache Apex core ☆350 · Updated 4 years ago
- Mirror of Apache Gearpump (Incubating) ☆295 · Updated 7 years ago
- Scala extensions for the Kryo serialization library ☆616 · Updated last year
- [PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streamin… ☆724 · Updated 3 years ago
- Fast, testable, Scala services built on TwitterServer and Finagle ☆2,269 · Updated 3 weeks ago
- Akka Streams & Akka HTTP for Large-Scale Production Deployments ☆1,435 · Updated last year
- DEPRECATED. Zeppelin has moved to Apache. Please make pull request there ☆408 · Updated 8 years ago
- Apache Spark to Apache Cassandra connector ☆1,944 · Updated 4 months ago
- Scala School 2 ☆343 · Updated 3 years ago