reisdebora / awesome-databricks
A curated list of awesome Databricks resources, including Spark
☆22Updated last year
Alternatives and similar repositories for awesome-databricks
Users that are interested in awesome-databricks are comparing it to the libraries listed below
- Data engineering interviews Q&A for data community by data community☆64Updated 5 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio…☆55Updated 2 years ago
- Full stack data engineering tools and infrastructure set-up☆57Updated 4 years ago
- Complete Repository to become an expert in SQL Window Functions☆25Updated last year
- Spark app to merge different schemas☆23Updated 4 years ago
- Repository of sample Databricks notebooks☆270Updated last year
- Apache Spark Guide☆34Updated 3 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark☆16Updated last year
- This repository will help you learn Databricks concepts with the help of examples. It will include all the important topics which…☆102Updated last month
- ETL pipeline using pyspark (Spark - Python)☆116Updated 5 years ago
- Data validation library for PySpark 3.0.0☆33Updated 2 years ago
- Spark and Delta Lake Workshop☆22Updated 3 years ago
- Challenge for those applying to the Software Engineer, Big Data position☆35Updated 14 years ago
- Delta Lake examples☆230Updated last year
- Guide for databricks spark certification☆58Updated 4 years ago
- Educational notes, hands-on problems w/ solutions for the Hadoop ecosystem☆87Updated 6 years ago
- A modern ELT demo using airbyte, dbt, snowflake and dagster☆28Updated 2 years ago
- A curated list of resources about Snowflake☆248Updated last year
- ☆16Updated 6 years ago
- ☆97Updated 2 years ago
- Developed a data pipeline to automate data warehouse ETL by building custom airflow operators that handle the extraction, transformation,…☆90Updated 3 years ago
- A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for …☆139Updated 5 years ago
- Weekly Data Engineering Newsletter☆96Updated last year
- data engineering 100 days 🤖 🧲 🦾 | #DE☆40Updated 2 years ago
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse.☆46Updated 9 months ago
- Minimal deployment of Great Expectations on lambda☆11Updated 5 years ago
- Examples surrounding Databricks.☆60Updated last year
- This repo contains live examples to build Databricks' Lakehouse and recommended best practices from the field.☆22Updated last year
- Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution☆67Updated 5 years ago
- Code samples, etc. for Databricks☆71Updated 5 months ago