apache / airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
☆39,103 · Updated this week
Alternatives and similar repositories for airflow:
Users who are interested in airflow compare it to the libraries listed below.
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ☆18,145 · Updated last month
- Distributed Task Queue (development branch) ☆25,765 · Updated this week
- Curated list of resources about Apache Airflow ☆3,748 · Updated 6 months ago
- Docker Apache Airflow ☆3,798 · Updated 2 years ago
- ClickHouse® is a real-time analytics database management system ☆39,429 · Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆7,858 · Updated this week
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ☆18,553 · Updated this week
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application… ☆10,481 · Updated this week
- The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data ☆41,129 · Updated this week
- Parallel computing with task scheduling ☆13,002 · Updated this week
- Machine Learning Toolkit for Kubernetes ☆14,735 · Updated 2 weeks ago
- Apache Spark - A unified analytics engine for large-scale data processing ☆40,704 · Updated this week
- The official home of the Presto distributed SQL query engine for big data ☆16,237 · Updated this week
- The Metadata Platform for your Data and AI Stack ☆10,389 · Updated this week
- Apache Superset is a Data Visualization and Data Exploration Platform ☆64,888 · Updated this week
- An orchestration platform for the development, production, and observation of data assets. ☆12,681 · Updated this week
- MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license. ☆50,689 · Updated this week
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,520 · Updated this week
- A high-performance observability data pipeline. ☆18,930 · Updated this week
- Apache Druid: a high performance real-time analytics database. ☆13,631 · Updated this week
- Python Development Workflow for Humans. ☆25,000 · Updated this week
- A time-series database for high-performance real-time analytics packaged as a Postgres extension ☆18,527 · Updated this week
- Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. ☆27,045 · Updated this week
- Data-Centric Pipelines and Data Versioning ☆6,208 · Updated last month
- Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object. ☆27,454 · Updated last week
- Open source platform for the machine learning lifecycle ☆19,714 · Updated this week
- Apache Iceberg ☆7,008 · Updated this week
- Build, Deploy and Manage AI/ML Systems ☆8,612 · Updated this week
- Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. ☆6,077 · Updated this week
- A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin ☆6,299 · Updated this week