cherrera20 / spark-ecosystem-cluster
Fully operational local setup to experiment with a Spark-based ecosystem
☆23Updated 4 months ago
Alternatives and similar repositories for spark-ecosystem-cluster
Users that are interested in spark-ecosystem-cluster are comparing it to the libraries listed below
- Local Environment to Practice Data Engineering☆143Updated 6 months ago
- ☆11Updated last year
- Apache Spark study group organized by the Data Engineering Latam community☆37Updated last year
- Introductory course on Generative Artificial Intelligence with Large Models☆25Updated last month
- This repository will contain all of the resources for the Mage component of the Data Engineering Zoomcamp: https://github.com/DataTalksCl…☆99Updated 10 months ago
- Project for "Data pipeline design patterns" blog.☆45Updated 11 months ago
- ☆142Updated 2 years ago
- This repository helps teach people how to correctly define and create cumulative tables!☆703Updated 8 months ago
- Code for "Efficient Data Processing in Spark" Course☆325Updated last month
- Basic resources for getting started with web scraping using Python.☆62Updated last year
- Airflow 3 demos from DevRel☆69Updated last month
- Code snippets for Data Engineering Design Patterns book☆128Updated 3 months ago
- Introductory Spark Course by Platzi 💚☆111Updated 4 years ago
- ☆825Updated 2 months ago
- Code for the Feature Engineering for Machine Learning course☆18Updated 10 months ago
- This repo has all the resources you need to become an amazing analytics engineer!☆232Updated last year
- Repository used for the Apache Spark Course on Platzi☆19Updated 4 years ago
- Python data repo, jupyter notebook, python scripts and data.☆518Updated 7 months ago
- A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces.☆74Updated last year
- Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra☆139Updated last year
- In this repository we store all materials for dlt workshops, courses, etc.☆204Updated last week
- Study repository for preparing for the AWS Cloud Practitioner certification☆16Updated last month
- MLOps Workshop using Weights & Biases (Wandb) and GitHub Actions.☆49Updated 3 months ago
- This project builds an end-to-end e-commerce data pipeline from data source to data warehouse and visualization.☆13Updated 10 months ago
- Generate synthetic Spotify music stream dataset to create dashboards. Spotify API generates fake event data emitted to Kafka. Spark consu…☆68Updated last year
- A template repository to create a data project with IAC, CI/CD, Data migrations, & testing☆268Updated last year
- Practical Data Engineering: A Hands-On Real-Estate Project Guide☆669Updated 10 months ago
- 📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems.☆46Updated 6 months ago
- Google Cloud Certified Cloud Digital Leader - Foundational☆9Updated last month
- Code for "Advanced data transformations in SQL" free live workshop☆83Updated 2 months ago