mtpatter / time-series-kafka-demo
Fully reproducible, Dockerized, step-by-step tutorial on how to mock a "real-time" Kafka data stream from a timestamped CSV file. Detailed blog post published on Towards Data Science.
☆40 Updated 4 years ago
Alternatives and similar repositories for time-series-kafka-demo
Users that are interested in time-series-kafka-demo are comparing it to the libraries listed below
- A Series of Notebooks on how to start with Kafka and Python ☆152 Updated 9 months ago
- A course by DataTalks Club that covers Spark, Kafka, Docker, Airflow, Terraform, dbt, BigQuery, etc. ☆14 Updated 3 years ago
- Delta-Lake, ETL, Spark, Airflow ☆48 Updated 3 years ago
- Full stack data engineering tools and infrastructure set-up ☆57 Updated 4 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validatio… ☆55 Updated 2 years ago
- Delta Lake Documentation ☆51 Updated last year
- Code snippets for Data Engineering Design Patterns book ☆294 Updated 8 months ago
- ☆88 Updated 3 years ago
- Project for real-time anomaly detection using Kafka and Python ☆58 Updated 3 years ago
- This project focuses on building a robust data pipeline using Apache Airflow to automate the ingestion of weather data from the OpenWeath… ☆22 Updated 2 years ago
- ☆44 Updated last year
- Repository for Data Engineering Interview Series ☆33 Updated last year
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 3 years ago
- Used Airflow, Postgres, Kafka, Spark, Cassandra, and GitHub Actions to establish an end-to-end data pipeline ☆29 Updated 2 years ago
- Simple stream processing pipeline ☆110 Updated last year
- Duke MIDS: Data Engineering and DataOps Course ☆67 Updated 11 months ago
- This project helps me to understand the core concepts of Apache Airflow. I have created custom operators to perform tasks such as staging… ☆95 Updated 6 years ago
- A Postgres data warehouse for processing synthetic data using IaC principles ☆19 Updated 2 years ago
- Docker Airflow - Contains a docker compose file for Airflow 2.0 ☆69 Updated 3 years ago
- Materials of the Official Helm Chart Webinar ☆27 Updated 4 years ago
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆92 Updated last year
- Code for blog at: https://www.startdataengineering.com/post/docker-for-de/ ☆40 Updated last year
- Produce Kafka messages, consume them and upload into Cassandra, MongoDB. ☆42 Updated 2 years ago
- End to end data engineering project ☆57 Updated 3 years ago
- Data Engineering examples for Airflow, Prefect; dbt for BigQuery, Redshift, ClickHouse, Postgres, DuckDB; PySpark for Batch processing; K… ☆68 Updated last month
- Apache Airflow Best Practices, published by Packt ☆51 Updated last year
- Some recipes for data engineering with Python ☆23 Updated 4 years ago
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra ☆143 Updated 2 years ago
- Cloned by the `dbt init` task ☆62 Updated last year
- Code for dbt tutorial ☆165 Updated 3 months ago