apache / airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
☆43,147 · Updated this week
Alternatives and similar repositories for airflow
Users who are interested in airflow compare it to the libraries listed below.
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ☆18,557 · Updated 6 months ago
- An orchestration platform for the development, production, and observation of data assets. ☆14,421 · Updated this week
- Parallel computing with task scheduling ☆13,590 · Updated this week
- Apache Superset is a Data Visualization and Data Exploration Platform ☆68,951 · Updated this week
- Apache Beam is a unified programming model for Batch and Streaming data processing. ☆8,362 · Updated this week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆16,138 · Updated this week
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ☆20,803 · Updated this week
- The official home of the Presto distributed SQL query engine for big data ☆16,564 · Updated this week
- Apache Druid: a high performance real-time analytics database. ☆13,869 · Updated this week
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application… ☆11,829 · Updated this week
- Data-Centric Pipelines and Data Versioning ☆6,265 · Updated 9 months ago
- Build, Manage and Deploy AI/ML Systems ☆9,627 · Updated this week
- Workflow Engine for Kubernetes ☆16,174 · Updated this week
- Always know what to expect from your data. ☆10,919 · Updated this week
- A time-series database for high-performance real-time analytics packaged as a Postgres extension ☆20,649 · Updated this week
- Scalable datastore for metrics, events, and real-time analytics ☆30,792 · Updated last week
- Cloud-native high-performance edge/middle/service proxy ☆27,001 · Updated this week
- Apache Spark - A unified analytics engine for large-scale data processing ☆42,266 · Updated this week
- CNCF Jaeger, a Distributed Tracing Platform ☆22,104 · Updated this week
- Python packaging and dependency management made easy ☆34,037 · Updated this week
- A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin ☆6,473 · Updated 3 weeks ago
- the portable Python dataframe library ☆6,205 · Updated this week
- DuckDB is an analytical in-process SQL database management system ☆34,025 · Updated this week
- Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. ☆27,955 · Updated this week
- The Metadata Platform for your Data and AI Stack ☆11,201 · Updated this week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆8,390 · Updated this week
- Write scalable load tests in plain Python 🚗💨 ☆27,070 · Updated last week
- The Prometheus monitoring system and time series database. ☆61,305 · Updated this week
- A high-performance observability data pipeline. ☆20,732 · Updated this week
- Fluentd: Unified Logging Layer (project under CNCF) ☆13,382 · Updated last week