pawl / awesome-etl
A curated list of awesome ETL frameworks, libraries, and software.
☆3,287 Updated 3 months ago
Related projects
Alternatives and complementary repositories for awesome-etl
- ETL best practices with airflow, with examples ☆1,295 Updated last month
- Curated list of resources about Apache Airflow ☆3,691 Updated 3 months ago
- A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow ☆2,081 Updated 11 months ago
- Actively curated list of awesome BI tools. PRs welcome! ☆2,097 Updated 3 months ago
- Python Extract Transform and Load Tables of Data ☆1,250 Updated 6 months ago
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,440 Updated last week
- A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin ☆6,204 Updated last month
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application… ☆9,976 Updated this week
- Extract Transform Load for Python 3.5+ ☆1,588 Updated last year
- A curated list of data engineering tools for software developers ☆6,828 Updated 3 weeks ago
- Docker Apache Airflow ☆3,776 Updated last year
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆1,913 Updated this week
- This repository is a getting started guide to Singer. ☆1,272 Updated 2 months ago
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,552 Updated 6 months ago
- A curated list of awesome Apache Spark packages and resources. ☆1,722 Updated 3 weeks ago
- A series of DAGs/Workflows to help maintain the operation of Airflow ☆1,681 Updated 5 months ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,481 Updated this week
- Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface. ☆1,956 Updated this week
- Official repository for pygrametl - ETL programming in Python ☆290 Updated 3 weeks ago
- A curated list of data engineering tools for software developers ☆462 Updated 7 years ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,309 Updated last month
- Dynamically generate Apache Airflow DAGs from YAML configuration files ☆1,209 Updated this week
- Dremio - the missing link in modern data ☆1,381 Updated 3 weeks ago
- Guides and docs to help you get up and running with Apache Airflow. ☆800 Updated 2 years ago
- Collect, aggregate, and visualize a data ecosystem's metadata ☆1,781 Updated last week
- ☆1,610 Updated last week
- The leader in Next-Generation Customer Data Infrastructure ☆6,849 Updated 2 months ago
- [NOT MAINTAINED] Bubbles – Python ETL framework ☆452 Updated 7 years ago
- Data-Centric Pipelines and Data Versioning ☆6,181 Updated this week
- Web-based SQL editor. Legacy project in maintenance mode. ☆5,057 Updated 2 weeks ago