cherrera20 / spark-ecosystem-cluster
Fully operational local setup to experiment with a Spark-based ecosystem
☆21 · Updated last month
Alternatives and similar repositories for spark-ecosystem-cluster:
Users who are interested in spark-ecosystem-cluster are comparing it to the repositories listed below.
- Apache Spark study group organized by the Data Engineering Latam community ☆36 · Updated last year
- Generate synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consu… ☆67 · Updated last year
- Local Environment to Practice Data Engineering ☆144 · Updated 3 months ago
- My notes for AWS Data Engineer Associate ☆30 · Updated 3 months ago
- This repository will contain all of the resources for the Mage component of the Data Engineering Zoomcamp: https://github.com/DataTalksCl… ☆98 · Updated 7 months ago
- Data Engineering examples for Airflow, Prefect; dbt for BigQuery, Redshift, ClickHouse, Postgres, DuckDB; PySpark for Batch processing; K… ☆64 · Updated last month
- Project for "Data pipeline design patterns" blog. ☆45 · Updated 7 months ago
- 📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems. ☆41 · Updated 2 months ago
- Code for "Efficient Data Processing in Spark" Course ☆290 · Updated 6 months ago
- This project builds an end-to-end e-commerce data pipeline from the data source into a data warehouse and visualization. ☆12 · Updated 6 months ago
- Sample project to demonstrate data engineering best practices ☆184 · Updated last year
- ☆11 · Updated 9 months ago
- Realtime Data Engineering Project ☆27 · Updated 2 months ago
- This repository helps teach people how to correctly define and create cumulative tables! ☆646 · Updated 5 months ago
- ☆16 · Updated 2 years ago
- Repository used for the Apache Spark course on Platzi ☆19 · Updated 4 years ago
- Sample repo for startdataengineering DE 101 free course ☆55 · Updated 9 months ago
- ☆104 · Updated 2 years ago
- ☆346 · Updated last year
- Code snippets for Data Engineering Design Patterns book ☆75 · Updated 2 weeks ago
- ☆33 · Updated last year
- Basic resources for getting started with web scraping using Python. ☆61 · Updated last year
- ☆63 · Updated 2 months ago
- This repo has all the resources you need to become an amazing analytics engineer! ☆181 · Updated last year
- ☆204 · Updated 2 months ago
- Near real time ETL to populate a dashboard. ☆73 · Updated 9 months ago
- Code for the "Build Your Own Search Engine" workshop ☆81 · Updated 8 months ago
- Code for blog at https://www.startdataengineering.com/post/python-for-de/ ☆72 · Updated 9 months ago
- In this repository we store all materials for dlt workshops, courses, etc. ☆131 · Updated last week
- Streaming data from a transactional database to a data warehouse using Kafka (Confluent Cloud), Snowflake, and PostgreSQL. ☆14 · Updated last year