apache / airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
☆43,457 Updated this week
Alternatives and similar repositories for airflow
Users who are interested in airflow are comparing it to the libraries listed below.
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ☆18,594 Updated 6 months ago
- Docker Apache Airflow ☆3,809 Updated 2 years ago
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ☆21,053 Updated this week
- Apache Druid: a high performance real-time analytics database. ☆13,890 Updated this week
- The official home of the Presto distributed SQL query engine for big data ☆16,582 Updated last week
- Apache Superset is a Data Visualization and Data Exploration Platform ☆69,353 Updated this week
- An orchestration platform for the development, production, and observation of data assets. ☆14,582 Updated this week
- Parallel computing with task scheduling ☆13,648 Updated this week
- Apache Beam is a unified programming model for Batch and Streaming data processing. ☆8,395 Updated this week
- The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lak… ☆20,216 Updated this week
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application… ☆11,965 Updated this week
- Apache Spark - A unified analytics engine for large-scale data processing ☆42,445 Updated this week
- Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data. ☆28,047 Updated last week
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆8,451 Updated this week
- Curated list of resources about Apache Airflow ☆3,860 Updated last year
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆16,241 Updated this week
- Always know what to expect from your data. ☆10,992 Updated this week
- Apache Pinot - A realtime distributed OLAP datastore ☆5,978 Updated this week
- Apache Iceberg ☆8,292 Updated this week
- Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io) ☆12,250 Updated last week
- The open source developer platform to build AI agents and models with confidence. Enhance your AI applications with end-to-end tracking, … ☆23,232 Updated this week
- The Metadata Platform for your Data and AI Stack ☆11,288 Updated this week
- Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more. ☆6,586 Updated last week
- Data-Centric Pipelines and Data Versioning ☆6,274 Updated 10 months ago
- Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies A… ☆47,265 Updated this week
- Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects,… ☆47,300 Updated this week
- Machine Learning Toolkit for Kubernetes ☆15,334 Updated last month
- Production-Grade Container Scheduling and Management ☆119,102 Updated last week
- Distributed reliable key-value store for the most critical data of a distributed system ☆50,964 Updated this week
- The portable Python dataframe library ☆6,262 Updated last week