Spark development environment for Kubernetes, spark-submit, and Jupyter Notebook
☆19Nov 30, 2021Updated 4 years ago
Alternatives and similar repositories for spark-dev-env-docker
Users that are interested in spark-dev-env-docker are comparing it to the libraries listed below
- This repo provides the Kubernetes Helm chart for deploying Pyspark Notebook.☆17Nov 16, 2022Updated 3 years ago
- ☆13Jun 7, 2022Updated 3 years ago
- Complete data engineering pipeline running on Minikube Kubernetes, Argo CD, Spark, Trino, S3, Delta Lake, Postgres + Debezium CDC, MySQL,…☆28May 19, 2025Updated 9 months ago
- ☆59Mar 3, 2024Updated 2 years ago
- ☆17Oct 20, 2020Updated 5 years ago
- Spark Code Examples☆14Dec 3, 2021Updated 4 years ago
- This repo contains all the cheatsheets you need to keep handy; more will be added soon.☆42Nov 10, 2022Updated 3 years ago
- ☆32Aug 18, 2021Updated 4 years ago
- Project for the talk presented at GDG DevFest Cerrado 2019 and TDC BH 2020☆34May 12, 2020Updated 5 years ago
- Exercises for module 4 - Bootcamp EDC - IGTI☆10May 5, 2022Updated 3 years ago
- Python tool to help export Azure DevOps WIKI into a single PDF☆10May 10, 2020Updated 5 years ago
- Automated basic infrastructure to install OKD4 on free ESXi☆13Aug 8, 2020Updated 5 years ago
- hackintosh 13900 macos☆12Jun 23, 2023Updated 2 years ago
- Proof of concept of a big data cluster using open source tools☆11Apr 10, 2024Updated last year
- This repository demonstrates the Outbox Pattern in microservices, leveraging the Django Outbox Pattern library developed at @juntossomosm…☆15Feb 5, 2024Updated 2 years ago
- Trying out the Dataframe Polars library with Delta Lake ... feat Python.☆12Jan 29, 2025Updated last year
- CLI tool to manage Kafka connectors☆10Mar 2, 2024Updated 2 years ago
- ☆13Apr 15, 2023Updated 2 years ago
- Tooling to build a custom Confluent Platform Kafka Connect container with additional connectors from Confluent Hub.☆15Oct 26, 2020Updated 5 years ago
- OSS data stack project☆12Apr 8, 2025Updated 10 months ago
- Glue VSCode devcontainer setup☆14Jan 31, 2023Updated 3 years ago
- ETL with Azure Cookbook, published by Packt☆12Jan 18, 2023Updated 3 years ago
- In this article, you will learn how to set up a real-time data processing and analytics environment using Docker, MySQL, Redpanda, MinIO,…☆11Jun 27, 2023Updated 2 years ago
- Capstone Project: Predicting default in P2P lending☆12Feb 27, 2017Updated 9 years ago
- A Python web-scraping and data analytics project used to identify key metrics and BI insights about Brazilian Real Estate Investment Fund (a…☆14Aug 26, 2020Updated 5 years ago
- Spending One Hundred days on blogging about cloud computing☆14Jul 12, 2022Updated 3 years ago
- Playbook to provision a Confluent Cluster☆10Oct 22, 2017Updated 8 years ago
- The official repository of the Akka Typed Essentials course with Scala☆12May 13, 2024Updated last year
- Using GitHub to showcase your Data Science projects - Materials☆17Apr 27, 2021Updated 4 years ago
- Script for ingesting data from Mercado Bitcoin☆11Jun 29, 2023Updated 2 years ago
- Airflow Examples: code samples for Medium articles☆14Jan 10, 2021Updated 5 years ago
- Creating a modern data pipeline using a combination of Terraform, AWS Lambda and S3, Snowflake, DBT, Mage AI, and Dash.☆14Jun 26, 2023Updated 2 years ago
- ☆18Sep 17, 2021Updated 4 years ago
- Spark in Action, 2nd edition - chapter 15 - Aggregating your data☆12Sep 8, 2022Updated 3 years ago
- A process manager written in C++ and Rust.☆14Oct 26, 2022Updated 3 years ago
- Class notes from DIO's Aceleração Dev #4 on Data Engineering, taught by Everis.☆13Feb 6, 2021Updated 5 years ago
- Standalone Apache Spark installer for Linux systems (Debian- and RHEL-based)☆13Dec 10, 2024Updated last year
- Dockerizing and Consuming an Apache Livy environment☆13Jun 29, 2022Updated 3 years ago
- Real-time data integration with Debezium, Postgres, and S3, deployed using Terraform.☆13Mar 23, 2022Updated 3 years ago