apssouza22 / big-data-pipeline-lambda-arch
A full big data pipeline (Lambda Architecture) with Spark, Kafka, HDFS and Cassandra.
☆176Updated last year
Alternatives and similar repositories for big-data-pipeline-lambda-arch:
Users who are interested in big-data-pipeline-lambda-arch are comparing it to the repositories listed below.
- This is the central repository for all materials related to the book Kafka Streams: Real-time Stream Processing! by Prashant Pandey.☆163Updated 4 years ago
- Supporting repository for the blog post at https://medium.com/@stephane.maarek/how-to-use-apache-kafka-to-transform-a-batch-pipeline-into…☆240Updated last year
- Spark Examples☆125Updated 3 years ago
- If you are planning or preparing for Apache Kafka Certification, then this is the right place for you. There are many Apache Kafka Certific…☆34Updated 4 years ago
- How to build an awesome data engineering team☆99Updated 5 years ago
- This repository contains code for Spark Streaming☆21Updated 3 years ago
- Tutorial for setting up a Spark cluster running inside of Docker containers located on different machines☆127Updated 2 years ago
- Apache Spark Course Material☆87Updated last year
- Apache Spark 3 - Structured Streaming Course Material☆44Updated 4 years ago
- Maven quick start for building Kafka Connect connectors.☆146Updated 4 years ago
- ☆47Updated 6 months ago
- Apache Spark Structured Streaming with Kafka using Python (PySpark)☆41Updated 5 years ago
- Simple stream processing pipeline☆98Updated 8 months ago
- (These solutions were tested on a 4-node Hortonworks cluster on my laptop. Do not test on your production environment until you test... :)☆21Updated 4 years ago
- Educational notes, hands-on problems w/ solutions for the Hadoop ecosystem☆87Updated 6 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data☆43Updated last year
- Apache Spark examples exclusively in Java☆100Updated last year
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.☆39Updated 3 years ago
- Kafka Connect connector to stream data in real time from Twitter.☆126Updated 2 years ago
- Simple example for Spark Streaming over a Kafka topic☆106Updated 4 years ago
- ☆306Updated 6 years ago
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python☆42Updated 2 years ago
- Real-world Spark pipelines examples☆83Updated 6 years ago
- Apache Spark 3 - Structured Streaming Course Material☆121Updated last year
- Data engineering interviews Q&A for data community by data community☆63Updated 4 years ago
- 📚 Tech blogs & talks by companies that run Apache Flink in production☆162Updated 3 weeks ago
- One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)☆119Updated 3 years ago
- Spark Structured Streaming / Kafka / Cassandra / Elastic☆183Updated 2 years ago
- Spark style guide☆257Updated 4 months ago
- A proof of concept using Divolte, Kafka, Druid and Superset☆62Updated 4 years ago