davlum / localemr
Local AWS EMR - A local service that imitates AWS EMR
☆27 Updated 2 years ago
Alternatives and similar repositories for localemr
Users who are interested in localemr are comparing it to the libraries listed below.
- Delta Lake helper methods. No Spark dependency. ☆23 Updated last year
- Airflow Providers containing Deferrable Operators & Sensors from Astronomer ☆149 Updated this week
- Pylint plugin for static code analysis on Airflow code ☆96 Updated 5 years ago
- Enforce Best Practices for all your Airflow DAGs. ⭐ ☆104 Updated this week
- Great Expectations Airflow operator ☆167 Updated last week
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆96 Updated last month
- The shared semantic layer definitions that dbt-core and MetricFlow use. ☆87 Updated 2 weeks ago
- A Python Library to support running data quality rules while the spark job is running⚡ ☆190 Updated this week
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆220 Updated 3 weeks ago
- A repository of sample code to show data quality checking best practices using Airflow. ☆78 Updated 2 years ago
- Fast iterative local development and testing of Apache Airflow workflows ☆201 Updated 2 months ago
- Utility functions for dbt projects running on Spark ☆33 Updated 8 months ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆259 Updated last year
- Making DAG construction easier ☆276 Updated last month
- PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows ☆43 Updated 2 weeks ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆64 Updated 3 years ago
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database ☆76 Updated 4 years ago
- ☆81 Updated 8 months ago
- A write-audit-publish implementation on a data lake without the JVM ☆45 Updated last year
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆169 Updated 2 years ago
- Read Delta tables without any Spark ☆47 Updated last year
- This repository contains the dbt-glue adapter ☆135 Updated last week
- Make dbt docs and Apache Superset talk to one another ☆151 Updated last month
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆223 Updated 6 months ago
- Pipeline definitions for managing data flows to power analytics at MIT Open Learning ☆44 Updated this week
- Weekly Data Engineering Newsletter ☆96 Updated last year
- ☆49 Updated last year
- pytest plugin to run the tests with support of pyspark ☆87 Updated 5 months ago
- ☆42 Updated 4 years ago
- A library that provides useful extensions to Apache Spark and PySpark. ☆231 Updated 3 months ago