dagster-io / dagster
An orchestration platform for the development, production, and observation of data assets.
☆11,711 · Updated this week
Related projects
Alternatives and complementary repositories for dagster
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ☆17,507 · Updated this week
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application… ☆9,976 · Updated this week
- DuckDB is an analytical in-process SQL database management system ☆24,380 · Updated this week
- Always know what to expect from your data. ☆9,997 · Updated this week
- 🧙 Build, run, and manage data pipelines for integrating and transforming data. ☆7,971 · Updated this week
- the portable Python dataframe library ☆5,318 · Updated this week
- Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. ☆5,787 · Updated this week
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ☆37,168 · Updated this week
- data load tool (dlt) is an open source Python library that makes data loading easy 🛠️ ☆2,662 · Updated this week
- Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to wr… ☆1,851 · Updated this week
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,440 · Updated last week
- Build data pipelines, the easy way 🛠️ ☆4,080 · Updated last year
- Efficient data transformation and modeling framework that is backwards compatible with dbt. ☆1,824 · Updated this week
- The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lak… ☆16,219 · Updated this week
- Parallel computing with task scheduling ☆12,604 · Updated this week
- Open Source Platform for developing, scaling and deploying serious ML, AI, and data science systems ☆8,256 · Updated this week
- Postgres with GPUs for ML/AI apps. ☆6,038 · Updated last week
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆1,913 · Updated this week
- Python SQL Parser and Transpiler ☆6,755 · Updated this week
- Dolt – Git for Data ☆17,974 · Updated this week
- lakeFS - Data version control for your data lake | Git for data ☆4,459 · Updated this week
- Business intelligence as code: build fast, interactive data visualizations in pure SQL and markdown ☆4,450 · Updated this week
- Compare tables within or across databases ☆2,945 · Updated 6 months ago
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆14,615 · Updated this week
- Utils for streaming large files (S3, HDFS, gzip, bz2...) ☆3,215 · Updated 3 weeks ago
- Self-serve BI to 10x your data team ⚡️ ☆3,999 · Updated this week
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,013 · Updated last month
- Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code. ☆1,727 · Updated this week
- Distributed data engine for Python/SQL designed for the cloud, powered by Rust ☆2,336 · Updated this week
- Malloy is an experimental language for describing data relationships and transformations. ☆1,996 · Updated this week