jeoffreylim / maelstrom
Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream processing), scalable (consumes messages at Spark worker nodes), and extremely reliable.
☆22 Updated 8 years ago
Alternatives and similar repositories for maelstrom:
Users that are interested in maelstrom are comparing it to the libraries listed below
- A High Performance Cluster Consumer for Kafka that creates Avro (boom) files in Hadoop in time based directory paths ☆42 Updated 8 years ago
- A library for strong, schema based conversion between 'natural' JSON documents and Avro ☆18 Updated last year
- Collection of generic Apache Flink operators ☆17 Updated 7 years ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines ☆29 Updated 9 years ago
- A distributed generic query layer for Apache Kafka Interactive Queries ☆26 Updated 7 years ago
- Camus Compressor merges files created by Camus and saves them in a compressed format. ☆12 Updated 2 years ago
- Cascading on Apache Flink® ☆54 Updated last year
- Flink Examples ☆39 Updated 8 years ago
- Apache Flink as a Cloudera Manager Service ☆12 Updated 9 years ago
- ☆26 Updated 5 years ago
- Schema Registry integration for Apache Spark ☆40 Updated 2 years ago
- Integrate Grafana with Ambari Metrics System ☆27 Updated 4 months ago
- SamzaSQL: Streaming SQL implementation on top of Apache Samza and Apache Kafka ☆29 Updated 8 years ago
- A set of tools to ease working with Zookeeper and Kafka. ☆23 Updated 9 years ago
- Flink performance tests ☆28 Updated 5 years ago
- Real-time analytics in Apache Flume ☆52 Updated 9 years ago
- A small project to show how to add lineage to Atlas when using Spark as ETL tool ☆12 Updated 8 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 8 years ago
- Toolkit that can bundle any Spring Boot application into an Apache Ambari Service, enabling Ambari to provision, manage and monitor the s… ☆13 Updated 9 years ago
- Demonstration of a Hive Input Format for Iceberg ☆26 Updated 4 years ago
- ☆20 Updated 8 years ago
- Easy metrics collection for Storm topologies using Coda Hale Metrics ☆100 Updated 11 years ago
- Example using Grafana with Druid ☆11 Updated 10 years ago
- Kafka Connect Integration with Kafka Streams + KSQL ☆11 Updated 6 years ago
- Hadoop MapReduce tool to convert Avro data files to Parquet format. ☆34 Updated 11 years ago
- This is a datasource implementation for quick query in Kafka with Spark ☆9 Updated last year
- An example of using Flink for Fault-Tolerant Stream Processing ☆12 Updated 6 years ago
- A sink to save Spark Structured Streaming DataFrame into Hive table ☆23 Updated 6 years ago
- The Scala Rule Engine ☆41 Updated 3 years ago
- Ansible playbook for automated HDP 2.x deployment install with Kerberos ☆19 Updated 8 years ago