cmdviegas / hadoop-spark
A script that deploys an Apache Hadoop, Apache Spark, and Apache Hive cluster in distributed mode, using Docker as the infrastructure.
☆27 Updated 3 months ago
Alternatives and similar repositories for hadoop-spark
Users who are interested in hadoop-spark are comparing it to the repositories listed below
- Big Data Ecosystem Docker ☆424 Updated 2 years ago
- ☆44 Updated 3 years ago
- ☆71 Updated 2 years ago
- Data Engineering made simple - An opinionated Data Engineering framework ☆65 Updated last year
- Apply for a job at Olist's Data Team: https://olist.gupy.io/ ☆51 Updated 3 years ago
- Repository to place/show my Python apps ☆20 Updated 3 years ago
- Project from the talk presented at GDG DevFest Cerrado 2019 and TDC BH 2020 ☆33 Updated 5 years ago
- Project for building a data lake from scratch ☆99 Updated last year
- ☆40 Updated last year
- ☆33 Updated 4 years ago
- ☆65 Updated last year
- Study and implementation of the main Machine Learning algorithms in Jupyter Notebooks. ☆223 Updated last week
- The One Billion Row Challenge using Python ☆82 Updated last year
- ☆23 Updated 4 years ago
- ☆144 Updated last year
- Project simulating the ingestion, processing, and analysis of data from the Ministry of Culture ☆47 Updated last year
- Big Data Ecosystem Docker ☆80 Updated 3 years ago
- Studies and projects. ☆62 Updated 3 years ago
- Spyrk-cluster is a data mini-lab, considering the main technologies used these days. It's useful to either understand how to configure a … ☆29 Updated 4 years ago
- ☆196 Updated 2 years ago
- My data science Docker image. ☆17 Updated 3 months ago
- Repository with a simple, clear tutorial on Polars, a data analysis library for Python and an alternative to Pandas. ☆40 Updated 2 years ago
- Analysis of open data from Prova Brasil 2011 with Airflow, S3, Redshift, and Metabase ☆16 Updated 2 years ago
- End-to-end Machine Learning project in the context of an e-commerce business ☆232 Updated last year
- This repository exemplifies a simple ELT process using delta to perform upsert and remove data files that aren't in the latest state of t… ☆107 Updated 3 years ago
- Python, Data Science, Machine Learning, and Deep Learning tutorials - Sigmoidal ☆173 Updated 2 years ago
- Repository for storing code and notebooks from blog posts and courses. ☆333 Updated 2 years ago
- Personal roadmap to guide my studies. ☆80 Updated 3 years ago
- This repo contains all the cheatsheets you need to keep handy, I will add more soon. ☆41 Updated 2 years ago
- Challenge for Data Engineers - VAGAS.com ☆30 Updated 6 years ago