pditommaso / awesome-pipeline
A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin
★6,446 · Updated last month
Alternatives and similar repositories for awesome-pipeline
Users who are interested in awesome-pipeline are comparing it to the repositories listed below.
- A curated list of awesome ETL frameworks, libraries, and software. ★3,459 · Updated last year
- Repository for the CWL standards. Use https://cwl.discourse.group/ for support. ★1,469 · Updated 9 months ago
- Curated list of resources about Apache Airflow. ★3,832 · Updated last year
- An orchestration platform for the development, production, and observation of data assets. ★14,035 · Updated this week
- Always know what to expect from your data. ★10,769 · Updated this week
- A next-generation curated knowledge sharing platform for data scientists and other technical professions. ★5,527 · Updated last year
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ★18,493 · Updated 4 months ago
- A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow. ★2,082 · Updated last year
- Parameterize, execute, and analyze notebooks. ★6,265 · Updated 2 months ago
- Data-Centric Pipelines and Data Versioning. ★6,257 · Updated 7 months ago
- A curated list of awesome streaming frameworks, applications, etc. ★2,892 · Updated last month
- Parallel computing with task scheduling. ★13,489 · Updated this week
- Prefect is a workflow orchestration framework for building resilient data pipelines in Python. ★20,376 · Updated this week
- Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per s… ★8,431 · Updated 2 weeks ago
- Actively curated list of awesome BI tools. PRs welcome! ★2,213 · Updated last year
- ETL best practices with Airflow, with examples. ★1,342 · Updated 11 months ago
- Python Extract Transform and Load Tables of Data. ★1,289 · Updated last month
- Docker Apache Airflow. ★3,811 · Updated 2 years ago
- The portable Python dataframe library. ★6,101 · Updated this week
- The fastest way to build data pipelines. Develop iteratively, deploy anywhere. ★3,601 · Updated 3 months ago
- A series of DAGs/Workflows to help maintain the operation of Airflow. ★1,744 · Updated last year
- dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build application… ★11,393 · Updated this week
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark. ★1,517 · Updated 9 months ago
- MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle. ★3,676 · Updated last week
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ★2,174 · Updated this week
- A DSL for data-driven computational pipelines. ★3,148 · Updated this week
- Extract Transform Load for Python 3.5+. ★1,600 · Updated 2 years ago
- A light-weight, flexible, and expressive statistical data testing library. ★4,016 · Updated 2 weeks ago
- Quilt is a data mesh for connecting people with actionable data. ★1,346 · Updated this week
- Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and… ★10,535 · Updated this week