reisdebora / awesome-databricks
A curated list of awesome Databricks resources, including Spark
☆18 Updated 10 months ago
Alternatives and similar repositories for awesome-databricks:
Users who are interested in awesome-databricks are comparing it to the repositories listed below.
- Awesome content all about Azure Databricks ☆16 Updated 3 years ago
- Full stack data engineering tools and infrastructure set-up ☆52 Updated 4 years ago
- AWS Big Data Certification ☆25 Updated 3 months ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆54 Updated 2 years ago
- Glue VSCode devcontainer setup ☆14 Updated 2 years ago
- Utility functions for dbt projects running on Spark ☆33 Updated 2 months ago
- Delta Lake Documentation ☆49 Updated 10 months ago
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution ☆19 Updated 4 years ago
- A bunch of hacks developed around dbt ☆48 Updated 5 years ago
- Examples for High Performance Spark ☆15 Updated 6 months ago
- Curated list of resources about Apache Airflow ☆19 Updated 4 years ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆27 Updated 2 years ago
- dbt / Amazon Redshift Demonstration Project ☆34 Updated 2 years ago
- Code Repository for GCP: Complete Google Data Engineer and Cloud Architect Guide(v), Published by Packt ☆16 Updated 2 years ago
- Data validation library for PySpark 3.0.0 ☆33 Updated 2 years ago
- Apache Spark Guide ☆31 Updated 3 years ago
- Demo for GitHub Universe 2022 ☆12 Updated 2 years ago
- Spark app to merge different schemas ☆23 Updated 4 years ago
- Data Engineering with Spark and Delta Lake ☆98 Updated 2 years ago
- Code snippets for Data Engineering Design Patterns book ☆101 Updated last month
- Skeleton project for Apache Airflow training participants to work on. ☆16 Updated 4 years ago
- This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. These examples cover IoT and CDC scenario… ☆23 Updated 5 months ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 Updated last year
- Snowflake Data Engineering in Action ☆15 Updated 6 months ago
- Yet Another (Spark) ETL Framework ☆21 Updated last year
- Build DataOps platform with Apache Airflow and dbt on AWS ☆55 Updated 3 years ago
- A PySpark job to handle upserts, convert to Parquet, and create partitions on S3 ☆26 Updated 4 years ago
- All the Snowflake Virtual Warehouse - Example ☆12 Updated 4 years ago
- ☆21 Updated 3 years ago
- Code for my "Efficient Data Processing in SQL" book. ☆56 Updated 8 months ago