mtpatter / time-series-kafka-demo
Fully reproducible, Dockerized, step-by-step tutorial on how to mock a "real-time" Kafka data stream from a timestamped CSV file. A detailed blog post is published on Towards Data Science.
☆39 Updated 3 years ago
Alternatives and similar repositories for time-series-kafka-demo
Users who are interested in time-series-kafka-demo are comparing it to the repositories listed below.
- A Postgres data warehouse for processing synthetic data using IAC principles ☆17 Updated 2 years ago
- End-to-end data engineer project ☆18 Updated last year
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆37 Updated last year
- Materials of the Official Helm Chart Webinar ☆27 Updated 3 years ago
- This project serves as a comprehensive guide to building an end-to-end data engineering pipeline using TCP/IP Socket, Apache Spark, OpenA… ☆35 Updated last year
- A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousin… ☆15 Updated 4 years ago
- A Series of Notebooks on how to start with Kafka and Python ☆154 Updated 2 months ago
- Spark all the ETL Pipelines ☆32 Updated last year
- A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces. ☆68 Updated last year
- ☆87 Updated 2 years ago
- DataTalks.Club's Data Engineering Zoomcamp Project ☆23 Updated 2 years ago
- Simple stream processing pipeline ☆102 Updated 11 months ago
- build dw with dbt ☆44 Updated 6 months ago
- ☆16 Updated last year
- Data Engineering Bootcamp ☆27 Updated last week
- Code for blog at: https://www.startdataengineering.com/post/docker-for-de/ ☆37 Updated last year
- Writes the CSV file to Postgres, read table and modify it. Write more tables to Postgres with Airflow. ☆35 Updated last year
- Delta-Lake, ETL, Spark, Airflow ☆47 Updated 2 years ago
- Databricks Certified Associate Spark Developer preparation toolkit to setup single node Standalone Spark Cluster along with material in t… ☆30 Updated last year
- ☆12 Updated 3 years ago
- Essential PySpark for Scalable Data Analytics, published by Packt ☆45 Updated 2 years ago
- Scaling Machine Learning in Three Week course in a collaboration with O'Reilly following the guidance of Adi Polak's book - Scaling Machi… ☆23 Updated 2 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆46 Updated last year
- ☆18 Updated last year
- ☆12 Updated 3 years ago
- Solution to all projects of Udacity's Data Engineering Nanodegree: Data Modeling with Postgres & Cassandra, Data Warehouse with Redshift,… ☆56 Updated 2 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆56 Updated 9 months ago
- code snippet for analytics sessions ☆34 Updated 3 years ago
- End-to-end ELT data engineering project ☆21 Updated 2 years ago
- Duke MIDS: Data Engineering and DataOps Course ☆66 Updated 4 months ago