lrmendess / open-source-datalake
Proof of concept of a big data cluster using open source tools
☆11 Updated last year
Alternatives and similar repositories for open-source-datalake
Users who are interested in open-source-datalake are comparing it to the repositories listed below.
- Data Engineering with Apache Spark ☆41 Updated 4 years ago
- ☆23 Updated 2 years ago
- ☆59 Updated last year
- Spark development environment for kubernetes, spark-submit and jupyter notebook ☆19 Updated 4 years ago
- ☆32 Updated 4 years ago
- ☆41 Updated last year
- Modern Data Stack ☆62 Updated 5 months ago
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta lake, Postgres+ Debezium CDC, MySQL,… ☆28 Updated 8 months ago
- ☆24 Updated 2 years ago
- ☆17 Updated last year
- This is an ETL application on AWS with general open sales and customer data that you can find here: https://github.com/camposvinicius/dat… ☆18 Updated 3 years ago
- ☆74 Updated 2 years ago
- Hands-on LAB - Databricks SQL ☆26 Updated last year
- Docker Apache Airflow ☆31 Updated 4 years ago
- This repo provides the Kubernetes Helm chart for deploying Pyspark Notebook. ☆17 Updated 3 years ago
- Big Data Ecosystem Docker ☆426 Updated 2 years ago
- Notebooks and tips about Databricks ☆28 Updated last year
- Repository dedicated to a Data Lakehouse workshop with Delta Lake ☆17 Updated 4 years ago
- Demo DAGs that show how to run dbt Core in Airflow using Cosmos ☆67 Updated 8 months ago
- Class notes from DIO's Aceleração Dev #4 on Data Engineering, taught by Everis. ☆13 Updated 4 years ago
- Code for Spark workshops with a Docker development environment ☆27 Updated 4 years ago
- Airflow plugins for implementing data pipelines. ☆49 Updated last month
- trino + hive + minio with postgres in docker compose ☆27 Updated 2 years ago
- Sets up Spark containers (Master, Workers, and History Server) + Jupyter ☆21 Updated last year
- Docker with Airflow and Spark standalone cluster ☆262 Updated 2 years ago
- This repository contains a Python script that pulls market sentiment data from SentiCrypt API and pushes it to Stitch Import API. ☆34 Updated 2 years ago
- Portfolio of projects and studies conducted in data engineering. ☆34 Updated 11 months ago
- Class content from cohort 6 of How's data engineering bootcamp ☆12 Updated 4 years ago
- ☆25 Updated last year
- Delta-Lake, ETL, Spark, Airflow ☆48 Updated 3 years ago