monksy / awesome-data-engineering
A curated list of data engineering tools for software developers
☆12Updated 6 years ago
Alternatives and similar repositories for awesome-data-engineering
Users that are interested in awesome-data-engineering are comparing it to the libraries listed below
- Cloudera_Material: Study Material to help people preparing for Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collab…☆39Updated 5 years ago
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution☆19Updated 5 years ago
- A collection of kafka-resources☆211Updated this week
- Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution☆67Updated 5 years ago
- A batch Data Pipeline that retrieves data from a user purchase table and a movie review table and is transformed to form a user behaviour…☆18Updated 2 months ago
- If you are planning or preparing for Apache Kafka Certification then this is the right place for you. There are many Apache Kafka Certific…☆40Updated 5 years ago
- Data engineering interviews Q&A for data community by data community☆64Updated 5 years ago
- Educational notes, hands-on problems w/ solutions for the Hadoop ecosystem☆87Updated 6 years ago
- Apache Spark Interview Question and Answers☆21Updated 5 years ago
- Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach.☆57Updated 2 years ago
- This project will help beginners learn Kafka with ease.☆48Updated 2 years ago
- PySpark Cheatsheet☆36Updated 2 years ago
- Data Vault 2.0: Code generation, Vertica, Airflow☆11Updated 5 years ago
- ☆20Updated 6 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,…☆90Updated 3 years ago
- ☆117Updated 5 years ago
- Python and AirFlow - Data Pipeline Orchestration☆16Updated 2 years ago
- Course Material☆25Updated 2 years ago
- ☆88Updated 3 years ago
- Everything about Apache Kafka☆206Updated 2 years ago
- Stream Processing Workshop☆23Updated last month
- This repository will help you learn Databricks concepts with the help of examples. It will include all the important topics which…☆102Updated last month
- Complete high-quality practice tests of 50 questions each will help you master your Confluent Certified Developer for Apache Kafka (CCDAK…☆91Updated 2 years ago
- ETL pipeline using pyspark (Spark - Python)☆116Updated 5 years ago
- Different ways to process data into Cassandra in realtime with technologies such as Kafka, Spark, Akka, Flink☆31Updated 3 years ago
- Apache Spark Course Material☆95Updated 2 years ago
- ☆44Updated 5 years ago
- Learn the Confluent Schema Registry & REST Proxy☆193Updated last year
- A hybrid Big Data pipeline architecture that combines a real-time streaming layer with a batch layer to process large datasets (Lambda Arc…☆184Updated last month
- This repo keeps all the relevant information for Confluent Certified Developer for Apache Kafka (CCDAK)☆123Updated last year