INSATunisia / TP-BigData
☆19Updated last year
Alternatives and similar repositories for TP-BigData
Users who are interested in TP-BigData are comparing it to the repositories listed below
- An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka…☆287Updated 8 months ago
 - A study guide for preparing for the CDP Administrator Private Cloud Base Exam (CDP-2001)☆14Updated 2 years ago
 - Course and lab exercises (TP) on Apache Spark☆11Updated 3 years ago
 - IBM Data Engineering Courses from Coursera☆71Updated 2 years ago
 - A Glue ETL job or EMR Spark job that reads from the Data Catalog, modifies the data, and uploads it to S3 and the Data Catalog☆13Updated 2 years ago
 - Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on.☆22Updated 2 years ago
 - The goal of this project is to build a docker cluster that gives access to Hadoop, HDFS, Hive, PySpark, Sqoop, Airflow, Kafka, Flume, Pos…☆74Updated 2 years ago
 - Cloudera_Material: Study Material to help people preparing for Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collab…☆39Updated 5 years ago
 - This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data wareh…☆160Updated 2 years ago
 - This project demonstrates how to use Apache Airflow to submit jobs to an Apache Spark cluster in different programming languages using Python…☆46Updated last year
 - A series on learning Apache Spark (PySpark), with quick tips and workarounds for everyday problems☆56Updated 2 years ago
 - ☆25Updated last year
 - Source code of the Apache Airflow Tutorial for Beginners on YouTube Channel Coder2j (https://www.youtube.com/c/coder2j)☆326Updated last year
 - All of my individual learning materials, documents, and notes from the process of getting the Coursera IBM Data Engineer Professional Cer…☆105Updated 2 years ago
 - ☆12Updated 3 months ago
 - Get data from API, run a scheduled script with Airflow, send data to Kafka and consume with Spark, then write to Cassandra☆143Updated 2 years ago
 - ☆16Updated 3 years ago
 - An awesome Analytics Engineering repository to learn and apply for real world problems.☆38Updated 2 years ago
 - Data Engineering Bootcamp☆30Updated 2 months ago
 - Spark all the ETL Pipelines☆35Updated 2 years ago
 - End-to-end data engineering project with Kafka, Airflow, Spark, Postgres, and Docker.☆103Updated 7 months ago
 - Create streaming data, send it to Kafka, transform it with PySpark, and load it into Elasticsearch and MinIO☆64Updated 2 years ago
 - Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach.☆57Updated 2 years ago
 - ☆105Updated 2 years ago
 - Sample repo for startdataengineering DE 101 free course☆69Updated last year
 - Learn the entire ETL process based on Spotify API data☆261Updated 4 years ago
 - ☆88Updated 3 years ago
 - Apache Spark 3 - Structured Streaming Course Material☆124Updated 2 years ago
 - My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on lambda architecture, that aggrega…☆505Updated 3 years ago
 - Code base for airflow training series Getting easy with Apache Airflow☆41Updated 2 years ago