Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
☆98 · Mar 19, 2026 · Updated this week
Alternatives and similar repositories for flowman
Users that are interested in flowman are comparing it to the libraries listed below.
- A library enabling DAG structuring of data processing programs such as ETLs ☆17 · Dec 13, 2025 · Updated 3 months ago
- ☆10 · May 16, 2022 · Updated 3 years ago
- Create a data mart using Azure Data Factory as ELT / ETL, Azure Synapse as database and Power BI as visualization tool. ☆20 · Apr 20, 2022 · Updated 3 years ago
- A Python Library to support running data quality rules while the spark job is running⚡ ☆201 · Updated this week
- Set of ETL utils for Spark ☆15 · May 4, 2020 · Updated 5 years ago
- Code examples for the Introduction to Kubeflow course ☆14 · Jan 12, 2021 · Updated 5 years ago
- data-mesh-demo ☆13 · Apr 12, 2022 · Updated 3 years ago
- A simplified, lightweight ETL Framework based on Apache Spark ☆587 · Jan 24, 2024 · Updated 2 years ago
- Kubernetes operator for Apache Hadoop HDFS used by the Stackable Data Platform ☆52 · Mar 17, 2026 · Updated last week
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆161 · Dec 10, 2022 · Updated 3 years ago
- Observability Python library - Powered by Kensu ☆22 · Oct 15, 2024 · Updated last year
- ☆12 · Jul 10, 2022 · Updated 3 years ago
- JNumberTools is an open-source Java library for solving complex problems in combinatorics and number theory. Whether you're a researcher,… ☆13 · Mar 16, 2026 · Updated last week
- Template to deploy a Data Product for data stream processing into a Data Landing Zone of the Data Management & Analytics Scenario (former… ☆36 · Jul 17, 2023 · Updated 2 years ago
- Spark data profiling utilities ☆23 · Nov 24, 2018 · Updated 7 years ago
- this repo provides best practice guidance, plan template, solution assessment tool etc. to help Machine Learning Studio(classic) customer… ☆20 · Jul 23, 2024 · Updated last year
- In-Memory Java Compiler ☆12 · Oct 13, 2020 · Updated 5 years ago
- Multi-stage, config driven, SQL based ETL framework using PySpark ☆26 · Sep 16, 2019 · Updated 6 years ago
- ZIO wrapper for AWS S3 SDK async client ☆11 · Feb 21, 2020 · Updated 6 years ago
- Library that aims to generate Kubernetes YAML templates from an Airflow DAG using the Airflow Kubernetes Pod Operator ☆10 · May 6, 2021 · Updated 4 years ago
- Azure IoT Hub Data Plane Python SDK ☆17 · Jan 17, 2026 · Updated 2 months ago
- React Bootstrap 4 Tabs Component ☆11 · Feb 3, 2023 · Updated 3 years ago
- ☆11 · Apr 29, 2024 · Updated last year
- Write SQL in Scala ☆30 · Nov 25, 2025 · Updated 4 months ago
- ⚛️ 🅱️ 🗂 ↔️ Off-canvas navigation for react using bootstrap ☆14 · Apr 16, 2019 · Updated 6 years ago
- sbt plugin to detect Akka module mismatches and fail build ☆10 · Sep 15, 2025 · Updated 6 months ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,596 · Updated this week
- dataX redis writer plugin ☆12 · Jul 13, 2017 · Updated 8 years ago
- An Operator for Apache Druid for Stackable Data Platform ☆12 · Mar 17, 2026 · Updated last week
- Horizon Exchange REST API Server ☆11 · Jan 21, 2026 · Updated 2 months ago
- Supporting files for the Node.js In Action livevideo course from Manning ☆15 · Nov 29, 2018 · Updated 7 years ago
- ☆10 · Apr 30, 2020 · Updated 5 years ago
- PySpark Tutorial for Beginners on Google Colab: Hands-On Guide ☆17 · Sep 13, 2020 · Updated 5 years ago
- Elasticsearch querying library ☆20 · Jun 16, 2019 · Updated 6 years ago
- Data Lineage Tracking And Visualization Solution ☆656 · Updated this week
- ☆14 · Jul 14, 2022 · Updated 3 years ago
- Github bot for keeping your Bazel dependencies up-to-date and clean ☆27 · Mar 20, 2020 · Updated 6 years ago
- Where the Meltano team runs Meltano! Get it??? ☆29 · Apr 9, 2025 · Updated 11 months ago
- Repository of the metadata specification mobilityDCAT-AP ☆18 · Mar 17, 2026 · Updated last week