twitter / summingbird
Streaming MapReduce with Scalding and Storm
☆2,139 Updated 2 years ago
Related projects:
- A Scala API for Cascading ☆3,495 Updated last year
- Distributed Prometheus time series database ☆1,428 Updated this week
- Abstract Algebra for Scala ☆2,289 Updated 3 weeks ago
- Lightweight real-time big data streaming engine over Akka ☆763 Updated 2 years ago
- Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in… ☆1,039 Updated last year
- Mirror of Apache Samza ☆811 Updated 3 weeks ago
- Reversible conversions between types ☆657 Updated 3 weeks ago
- I/O and Microservice library for Scala ☆1,143 Updated 3 years ago
- Wonderful reusable code from Twitter ☆2,686 Updated 3 weeks ago
- A Thrift parser/generator ☆790 Updated 4 months ago
- Avro Data Source for Apache Spark ☆539 Updated 5 years ago
- Simplifying robust end-to-end machine learning on Apache Spark. ☆468 Updated 7 years ago
- CSV Data Source for Apache Spark 1.x ☆1,053 Updated 5 years ago
- LinkedIn's previous generation Kafka to HDFS pipeline. ☆882 Updated 4 years ago
- Fast, testable, Scala services built on TwitterServer and Finagle ☆2,272 Updated 4 months ago
- Mirror of Apache Gearpump (Incubating) ☆298 Updated 6 years ago
- Livy is an open source REST interface for interacting with Apache Spark from anywhere ☆1,008 Updated last year
- BlinkDB: Sub-Second Approximate Queries on Very Large Data. ☆660 Updated 10 years ago
- ☆399 Updated this week
- KillrWeather is a reference application (work in progress) showing how to easily integrate streaming and batch data processing with Apach… ☆1,182 Updated 7 years ago
- Spark reference applications ☆656 Updated 7 months ago
- Mirror of Apache Apex core ☆350 Updated 3 years ago
- Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code. ☆1,139 Updated last year
- Scala extensions for the Kryo serialization library ☆608 Updated 3 weeks ago
- [PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streamin… ☆725 Updated 2 years ago
- Akka Streams & Akka HTTP for Large-Scale Production Deployments ☆1,433 Updated 5 months ago
- Twitter's Effective Scala Guide ☆2,243 Updated last year
- Netflix's distributed Data Pipeline ☆794 Updated last year
- Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. ☆855 Updated 3 years ago
- Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning ☆1,786 Updated 3 years ago