monksy / awesome-data-engineering
A curated list of data engineering tools for software developers
☆10 · Updated 6 years ago
Alternatives and similar repositories for awesome-data-engineering
Users who are interested in awesome-data-engineering are comparing it to the repositories listed below.
- https://www.packtpub.com/books/info/authors/tomasz-lelek ☆12 · Updated 3 years ago
- A curated list of awesome Databricks resources, including Spark ☆19 · Updated 11 months ago
- Example project using DBT, Databricks and AdventureWorks sample database ☆12 · Updated 2 years ago
- Slowly Changing Dimension type 2 using Hive query language using exclusive join technique with ORC Hive tables, partitioned and clustered… ☆16 · Updated 6 years ago
- Apache Kafka Guide ☆31 · Updated 3 years ago
- Supplementary material for Building a Modern Data Platform with Snowflake, from Pearson. ☆20 · Updated 3 years ago
- Showcase of some basic Kafka concepts and their integration with Spring Boot ☆9 · Updated 4 years ago
- Apache Spark Interview Questions and Answers ☆21 · Updated 4 years ago
- Connect DBVisualizer to Hortonworks HiveServer2 ☆9 · Updated 10 years ago
- Effective Kafka ☆54 · Updated 3 years ago
- Sample Airflow DAGs to load data from the CovidTracking API to Snowflake via an AWS S3 intermediary. ☆16 · Updated 4 years ago
- Hadoop/Hive/Spark container to perform CI tests ☆11 · Updated 4 years ago
- Recommender System (Java, Apache Spark) ☆9 · Updated 6 years ago
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution ☆19 · Updated 4 years ago
- Master complex big data processing, stream analytics, and machine learning with Apache Spark ☆18 · Updated 2 years ago
- Example code from my Futures and Observables presentation ☆21 · Updated 11 years ago
- Selected resources for SRE/DevOps professionals covering various Computer Science areas: Software Engineering & Architecture, Operations,… ☆23 · Updated 7 years ago
- Fundamentals of Apache Flink [video], published by Packt ☆12 · Updated 2 years ago
- Hands-on workshop with Iceberg, Redpanda, Debezium and Kafka-Connect ☆13 · Updated 7 months ago
- Udacity Data Engineering Nano Degree Project, Data Modeling for fact and dimension tables, and ETL pipeline that transfers data from file… ☆9 · Updated 4 years ago
- Repo for Data Warehouse Concepts, Design, and Data Integration by University of Colorado System (coursera)(Notes,Assignments, quiz and r… ☆44 · Updated 7 years ago
- Apache Kafka 1.0 Cookbook, published by Packt ☆21 · Updated 2 years ago
- KSQL Step-by-step tutorial using the basic functions of Apache Kafka's Streaming SQL Engine ☆10 · Updated 5 years ago
- ☆11 · Updated last year
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆55 · Updated 2 years ago
- Pipeline library for StreamSets Data Collector and Transformer ☆33 · Updated 2 years ago
- Labs and data files for a full-day Spark workshop ☆24 · Updated last week
- AWS Big Data Certification ☆25 · Updated 4 months ago
- This is a basic Apache Pinot example for ingesting real-time MySQL change logs using Debezium ☆27 · Updated 4 years ago
- Code for the fictitious food delivery company GottaEat used in the Pulsar In Action book ☆18 · Updated 3 years ago