dbusteed / kafka-spark-streaming-example
☆37Updated 5 years ago
Alternatives and similar repositories for kafka-spark-streaming-example:
Users who are interested in kafka-spark-streaming-example are comparing it to the repositories listed below.
- Series following learning from Apache Spark (PySpark), with quick tips and workarounds for daily problems at hand☆47Updated last year
- PySpark-ETL☆23Updated 5 years ago
- Design/Implement stream/batch architecture on NYC taxi data | #DE☆26Updated 3 years ago
- Testing Spark Structured Streaming and Kafka with real data from traffic sensors☆16Updated 2 years ago
- Apache Spark 3 - Structured Streaming Course Material☆121Updated last year
- Apache Spark Structured Streaming with Kafka using Python (PySpark)☆41Updated 5 years ago
- The Ultimate Hands-On Hadoop - Tame your Big Data!: https://www.udemy.com/the-ultimate-hands-on-hadoop-tame-your-big-data/☆8Updated 6 years ago
- This project helps me to understand the core concepts of Apache Airflow. I have created custom operators to perform tasks such as staging…☆76Updated 5 years ago
- Dockerizing an Apache Spark Standalone Cluster☆43Updated 2 years ago
- PySpark Tutorial for Beginners on Google Colab: Hands-On Guide☆16Updated 4 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D…☆30Updated 4 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…☆53Updated last year
- Airflow helm chart for AWS EKS☆18Updated 4 years ago
- Simple stream processing pipeline☆98Updated 8 months ago
- PySpark Cheatsheet☆36Updated 2 years ago
- ☆87Updated 2 years ago
- Developed an ETL pipeline for a Data Lake that extracts data from S3, processes the data using Spark, and loads the data back into S3 as …☆16Updated 5 years ago
- 🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python☆84Updated 5 years ago
- ☆41Updated 7 months ago
- Classwork projects and homework done through the Udacity Data Engineering Nanodegree☆74Updated last year
- Project for real-time anomaly detection using Kafka and Python☆59Updated 2 years ago
- ☆148Updated 6 years ago
- This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages using Python…☆38Updated 11 months ago
- Repo which holds the materials for the EMR Zero To Hero☆27Updated 2 years ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆41Updated last year
- ETL pipeline using PySpark (Spark - Python)☆113Updated 4 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.☆39Updated 3 years ago
- Data engineering interviews Q&A for data community by data community☆64Updated 4 years ago
- End-to-end Kafka Streaming Examples on Databricks with Evolving Avro Schemas.☆9Updated 11 months ago
- This repository contains code for Spark Streaming☆21Updated 3 years ago