Anant / example-airflow-and-spark
☆12 Updated 3 years ago
Alternatives and similar repositories for example-airflow-and-spark
Users who are interested in example-airflow-and-spark are comparing it to the repositories listed below
- A data pipeline with Kafka, Spark Streaming, dbt, Docker, Airflow, and GCP! ☆12 Updated 2 years ago
- Docker with Airflow and Spark standalone cluster ☆261 Updated 2 years ago
- End-to-end Kafka Streaming Examples on Databricks with Evolving Avro Schemas. ☆9 Updated last year
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks ☆24 Updated 3 years ago
- The resources of the preparation course for Databricks Data Engineer Professional certification exam ☆127 Updated last month
- This project is for demonstrating knowledge of Data Engineering tools and concepts and also learning in the process ☆46 Updated 2 years ago
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing ☆271 Updated last year
- Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow ☆149 Updated 5 years ago
- Near real time ETL to populate a dashboard. ☆72 Updated last year
- Delta-Lake, ETL, Spark, Airflow ☆47 Updated 2 years ago
- ☆88 Updated 2 years ago
- Resources for video demonstrations and blog posts related to DataOps on AWS ☆181 Updated 3 years ago
- End to end data engineering project ☆57 Updated 2 years ago
- Ravi Azure ADB ADF Repository ☆65 Updated 6 months ago
- 📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems. ☆47 Updated 6 months ago
- Generate synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consu… ☆68 Updated last year
- Python data repo, jupyter notebook, python scripts and data. ☆519 Updated 8 months ago
- Data Engineering with AWS, Published by Packt ☆328 Updated 2 years ago
- Apache Spark 3 - Structured Streaming Course Material ☆121 Updated last year
- ☆40 Updated 2 years ago
- Local Environment to Practice Data Engineering ☆143 Updated 7 months ago
- The resources of the preparation course for Databricks Data Engineer Associate certification exam ☆463 Updated last month
- ☆152 Updated 3 years ago
- Data pipeline that scrapes Rust cheater Steam profiles ☆52 Updated 3 years ago
- Project for "Data pipeline design patterns" blog. ☆45 Updated last year
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra ☆141 Updated 2 years ago
- Big Data Engineering practice project, including ETL with Airflow and Spark using AWS S3 and EMR ☆84 Updated 6 years ago
- This repo contains "Databricks Certified Data Engineer Professional" Questions and related docs. ☆102 Updated last year
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆30 Updated 5 years ago
- Udacity Data Engineering Nanodegree Capstone Project ☆36 Updated 5 years ago