apache / airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
☆38,453 Updated this week
Alternatives and similar repositories for airflow:
Users who are interested in airflow are comparing it to the libraries listed below.
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ☆18,045 Updated last week
- Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. ☆26,794 Updated this week
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application… ☆10,267 Updated this week
- Machine Learning Toolkit for Kubernetes ☆14,570 Updated 2 months ago
- Apache Superset is a Data Visualization and Data Exploration Platform ☆63,976 Updated this week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆14,902 Updated this week
- Docker Apache Airflow ☆3,792 Updated last year
- Data-Centric Pipelines and Data Versioning ☆6,199 Updated last week
- ClickHouse® is a real-time analytics database management system ☆38,604 Updated this week
- Parallel computing with task scheduling ☆12,878 Updated this week
- Apache Beam is a unified programming model for Batch and Streaming data processing. ☆7,966 Updated this week
- Curated list of resources about Apache Airflow ☆3,726 Updated 5 months ago
- Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks. ☆5,954 Updated this week
- An orchestration platform for the development, production, and observation of data assets. ☆12,376 Updated this week
- The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lak… ☆17,003 Updated this week
- Apache Pinot - A realtime distributed OLAP datastore ☆5,611 Updated this week
- The official home of the Presto distributed SQL query engine for big data ☆16,176 Updated this week
- high-performance graph database for real-time use cases ☆20,603 Updated this week
- Distributed Task Queue (development branch) ☆25,359 Updated this week
- Open source platform for the machine learning lifecycle ☆19,312 Updated this week
- Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads. ☆35,044 Updated this week
- The simplest, fastest way to get business intelligence and analytics to everyone in your company ☆39,735 Updated this week
- A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin ☆6,253 Updated last month
- The Open Source Feature Store for Machine Learning ☆5,752 Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆7,770 Updated this week
- Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting… ☆4,484 Updated 2 weeks ago
- DuckDB is an analytical in-process SQL database management system ☆26,030 Updated this week
- MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license. ☆49,731 Updated this week
- 🦉 Data Versioning and ML Experiments ☆14,110 Updated last week
- Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more. ☆6,450 Updated last week