reisdebora / awesome-databricks
A curated list of awesome Databricks resources, including Spark
☆16 Updated 7 months ago
Alternatives and similar repositories for awesome-databricks:
Users who are interested in awesome-databricks are comparing it to the repositories listed below
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 Updated last year
- Full stack data engineering tools and infrastructure set-up ☆48 Updated 4 years ago
- Simplified ETL process in Hadoop using Apache Spark. Has complete ETL pipeline for datalake. SparkSession extensions, DataFrame validatio… ☆53 Updated last year
- AWS Big Data Certification ☆25 Updated last month
- Awesome content all about Azure Databricks ☆16 Updated 3 years ago
- Road to Azure Data Engineer Part-II: DP-201 - Designing an Azure Data Solution ☆19 Updated 4 years ago
- Delta Lake Documentation ☆48 Updated 7 months ago
- Data engineering interviews Q&A for data community by data community ☆63 Updated 4 years ago
- Spark app to merge different schemas ☆23 Updated 4 years ago
- ☆30 Updated 10 months ago
- Metadata Driven Development (m3d) is a cloud and platform agnostic framework for the automated creation, management and governance of dat… ☆31 Updated last year
- A modern ELT demo using airbyte, dbt, snowflake and dagster ☆26 Updated 2 years ago
- Glue VSCode devcontainer setup ☆14 Updated 2 years ago
- Curated list of resources about Apache Airflow ☆19 Updated 3 years ago
- Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects ☆42 Updated 2 months ago
- Yet Another (Spark) ETL Framework ☆18 Updated last year
- A curated list of data engineering tools for software developers ☆10 Updated 6 years ago
- Road to Azure Data Engineer Part-I: DP-200 - Implementing an Azure Data Solution ☆66 Updated 4 years ago
- Snowflake demo for Financial Services ☆19 Updated 10 months ago
- Spark and Delta Lake Workshop ☆22 Updated 2 years ago
- A Pyspark job to handle upserts, conversion to parquet and create partitions on S3 ☆26 Updated 4 years ago
- ☆17 Updated 6 years ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 Updated last year
- Skeleton project for Apache Airflow training participants to work on. ☆16 Updated 4 years ago
- Code snippets for Data Engineering Design Patterns book ☆68 Updated last week
- Public source code for the Batch Processing with Apache Beam (Python) online course ☆18 Updated 4 years ago
- Data Engineering with Spark and Delta Lake ☆94 Updated 2 years ago
- Complete Repository to become an expert in SQL Window Functions ☆25 Updated 10 months ago
- This is a real-life, high throughput streaming ELT data pipeline for ecommerce ☆13 Updated last year
- Analytics engineering with dbt - projects and developer environment ☆16 Updated 4 months ago