amanparmar17 / Kafka_Pyspark
Base Kafka producer, consumer, Flask API, and PySpark Structured Streaming job
☆11Updated 3 years ago
Alternatives and similar repositories for Kafka_Pyspark
Users that are interested in Kafka_Pyspark are comparing it to the libraries listed below
- Testing Spark Structured Streaming and Kafka with real data from traffic sensors☆16Updated 2 years ago
- ☆41Updated 11 months ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA…☆38Updated last year
- Apache Spark using SQL☆14Updated 3 years ago
- This repository contains the necessary configuration files and DAGs (Directed Acyclic Graphs) for setting up a robust data engineering en…☆21Updated last year
- In this project, we set up an end-to-end data engineering pipeline using Apache Spark, Azure Databricks, Data Build Tool (DBT) using Azure as our …☆32Updated last year
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr…☆41Updated last year
- End to end data engineering project☆56Updated 2 years ago
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.☆19Updated 2 years ago
- Simple ETL pipeline using Python☆26Updated 2 years ago
- This project introduces PySpark, a powerful open-source framework for distributed data processing. We explore its architecture, component…☆33Updated 9 months ago
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin…☆15Updated 4 years ago
- Mastering Big Data Analytics with PySpark, Published by Packt☆160Updated 10 months ago
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB.☆42Updated last year
- An end-to-end data engineering pipeline that fetches real-time YouTube analytics and streams them through Kafka for processing with ksqlD…☆12Updated last year
- Writes a CSV file to Postgres, reads the table, and modifies it. Writes more tables to Postgres with Airflow.☆36Updated last year
- Building a Modern Data Lake with Minio, Spark, Airflow via Docker.☆20Updated last year
- ☆15Updated 3 years ago
- Apache Spark Structured Streaming with Kafka using Python (PySpark)☆40Updated 6 years ago
- This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessar…☆45Updated last year
- Series follows learning from Apache Spark (PySpark) with quick tips and workaround for daily problems in hand☆55Updated last year
- A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc…☆23Updated 3 years ago
- ☆39Updated 2 years ago
- This repo gives an introduction to setting up streaming analytics using open source technologies☆25Updated 2 years ago
- Big Data webapp using Chicago street congestion, crashes, red light violations, and speed camera violations☆40Updated 4 years ago
- ☆39Updated 2 years ago
- ☆21Updated last year
- Content related to Mastering Postgresql along with videos.☆16Updated 3 years ago
- This project provides an end-to-end data processing and visualization of visa numbers in Japan using PySpark and Plotly. The spark cluste…☆11Updated last year
- Create streaming data, transfer it to Kafka, modify it with PySpark, and send it to ElasticSearch and MinIO☆62Updated last year