newfront / spark-summit-2018
Spark Application : Spark Summit 2018 : Streaming Trend Discovery
☆11 Updated 6 years ago
Alternatives and similar repositories for spark-summit-2018:
Users who are interested in spark-summit-2018 are comparing it to the repositories listed below
- ☆9 Updated 9 years ago
- Open source analytics platform powered by Apache Cassandra, Spark, and Kafka ☆34 Updated 9 years ago
- Few things we've met during our etl project based on spark ☆24 Updated 6 years ago
- An extension of the kafka-python package that adds features like multiprocess consumers. ☆39 Updated last year
- Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics. ☆110 Updated 2 years ago
- ☆14 Updated 8 years ago
- Tool for exploring data on an Apache Kafka cluster ☆42 Updated 4 years ago
- Ranger is a contextual data generator used to make sensible data for integration tests or for experimenting in the database ☆59 Updated 4 years ago
- Apache Spark Awesome List ☆14 Updated 8 years ago
- Kafka sink connector for streaming messages to PostgreSQL ☆90 Updated 4 years ago
- Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry. ☆18 Updated 7 years ago
- Development repository for the kafka cookbook ☆92 Updated 2 months ago
- A schema store service that tracks and manages all the schemas used in the Data Pipeline ☆87 Updated 3 years ago
- ☆76 Updated 8 years ago
- Supporting material (code, schemas etc) for Unified Log Processing (Manning Publications) ☆97 Updated 2 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 7 years ago
- Interactive Audience Analytics with Spark and HyperLogLog ☆55 Updated 9 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 Updated 2 years ago
- A/B experiments service ☆33 Updated this week
- recordbus: mysql binlog to apache kafka ☆80 Updated 9 years ago
- Exelixi is a distributed framework based on Apache Mesos, mostly implemented in Python using gevent for high-performance concurrency. It … ☆133 Updated 11 years ago
- Annotation driven Java object writer for ORC with runtime code generation for speed. ☆21 Updated last year
- A collection of datasets and databases ☆24 Updated 6 years ago
- ☆7 Updated 10 years ago
- The code for the in memory data pipeline that was presented at Berlin Buzzwords 2015. ☆10 Updated 9 years ago
- INACTIVE: A PostgreSQL extension to produce messages to Apache Kafka. ☆111 Updated 9 years ago
- Cascading on Apache Flink® ☆54 Updated last year
- AWS bootstrap scripts for Mozilla's flavoured Spark setup. ☆47 Updated 5 years ago
- Telecom scenarios implemented with streaming techniques ☆11 Updated last year
- Scriptable scheduler for periodical Hadoop workflows ☆22 Updated 7 years ago