liliasfaxi / Atelier-Spark
Courses and labs on Apache Spark
☆11 Updated 3 years ago
Alternatives and similar repositories for Atelier-Spark
Users that are interested in Atelier-Spark are comparing it to the libraries listed below
- The goal of this project is to build a Docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Pos… ☆72 Updated 2 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆48 Updated last year
- Create streaming data, send it to Kafka, transform it with PySpark, and load it into Elasticsearch and MinIO ☆63 Updated 2 years ago
- Repository for all ITVersity Vagrant Boxes. ☆32 Updated 5 years ago
- ☆15 Updated 3 years ago
- Uses Airflow, Postgres, Kafka, Spark, Cassandra, and GitHub Actions to build an end-to-end data pipeline ☆29 Updated last year
- Tutorial for setting up a Spark cluster running inside of Docker containers located on different machines ☆134 Updated 2 years ago
- Spark data pipeline that processes movie ratings data. ☆30 Updated last week
- Build & Learn Data Engineering and Machine Learning over Kubernetes. No-shortcut approach. ☆57 Updated 2 years ago
- Dockerizing an Apache Spark Standalone Cluster ☆43 Updated 3 years ago
- Django-based course management platform for Zoomcamps ☆69 Updated this week
- This project shows how to capture changes from a Postgres database and stream them into Kafka ☆38 Updated last year
- ☆26 Updated last year
- ☆88 Updated 3 years ago
- A study guide for preparing for the CDP Administrator Private Cloud Base Exam (CDP-2001) ☆15 Updated 2 years ago
- PySpark Cheatsheet ☆36 Updated 2 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆38 Updated 4 years ago
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D… ☆30 Updated 5 years ago
- A Series of Notebooks on how to start with Kafka and Python ☆152 Updated 7 months ago
- This repository contains the code for a realtime election voting system. The system is built using Python, Kafka, Spark Streaming, Postgr… ☆41 Updated last year
- Guide for Databricks Spark certification ☆58 Updated 4 years ago
- This project focuses on building a robust data pipeline using Apache Airflow to automate the ingestion of weather data from the OpenWeath… ☆22 Updated 2 years ago
- Shows how to install Hadoop on Google Colab and how to run MapReduce in Hadoop ☆33 Updated 5 years ago
- EverythingApacheNiFi ☆115 Updated last year
- Writes a CSV file to Postgres, reads the table, and modifies it. Writes more tables to Postgres with Airflow. ☆37 Updated 2 years ago
- Azure Deployments using Terraform ☆30 Updated 2 years ago
- Orchestrate Spark Jobs from Kubeflow Pipelines and poll for the status. ☆52 Updated 3 years ago
- Scraping my school's alumni data from LinkedIn using a bot 🤖 ☆24 Updated 4 years ago
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution ☆19 Updated 5 years ago
- Source code of the Apache Airflow Tutorial for Beginners on YouTube Channel Coder2j (https://www.youtube.com/c/coder2j) ☆321 Updated last year