reisdebora / awesome-databricks
A curated list of awesome Databricks resources, including Spark
☆17 Updated 8 months ago
Alternatives and similar repositories for awesome-databricks:
Users who are interested in awesome-databricks are comparing it to the repositories listed below:
- Simplified ETL process in Hadoop using Apache Spark. Has a complete ETL pipeline for a data lake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- Full-stack data engineering tools and infrastructure setup ☆50 Updated 4 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 Updated last year
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution ☆19 Updated 4 years ago
- Delta Lake Documentation ☆49 Updated 8 months ago
- Optimizing Databricks Workloads, published by Packt ☆17 Updated 2 years ago
- Metadata Driven Development (m3d) is a cloud and platform agnostic framework for the automated creation, management and governance of dat… ☆31 Updated last year
- Spark app to merge different schemas ☆23 Updated 4 years ago
- Public source code for the Batch Processing with Apache Beam (Python) online course ☆18 Updated 4 years ago
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- Utility functions for dbt projects running on Spark ☆31 Updated last month
- Spark and Delta Lake Workshop ☆22 Updated 2 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code ☆14 Updated 3 years ago
- Examples surrounding Databricks. ☆57 Updated 8 months ago
- Code examples for the Introduction to Kubeflow course ☆14 Updated 4 years ago
- Apache Spark Guide ☆31 Updated 3 years ago
- AWS Big Data Certification ☆25 Updated 2 months ago
- dbt / Amazon Redshift Demonstration Project ☆34 Updated 2 years ago
- ☆11 Updated 3 years ago
- Glue VSCode devcontainer setup ☆14 Updated 2 years ago
- Awesome content all about Azure Databricks ☆16 Updated 3 years ago
- Repository of notebooks and related collateral used in the Databricks Demo Hub, showing how to use Databricks, Delta Lake, MLflow, and mo… ☆25 Updated 3 years ago
- A modern ELT demo using Airbyte, dbt, Snowflake, and Dagster ☆27 Updated 2 years ago
- Code that was used as an example during the Data+AI Summit 2020 ☆15 Updated 4 years ago
- Collection of Databricks and Jupyter Notebooks ☆21 Updated last year
- Spark data pipeline that processes movie ratings data. ☆28 Updated last month
- PySpark boilerplate for running a production-ready data pipeline ☆28 Updated 3 years ago
- Code snippets for Data Engineering Design Patterns book ☆74 Updated last month
- A bunch of hacks developed around dbt ☆48 Updated 5 years ago
- Analytics engineering with dbt - projects and developer environment ☆16 Updated 5 months ago