Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
☆98 · Apr 10, 2026 · Updated this week
Alternatives and similar repositories for flowman
Users that are interested in flowman are comparing it to the libraries listed below.
- A library enabling DAG structuring of data processing programs such as ETLs ☆17 · Dec 13, 2025 · Updated 4 months ago
- Operator for Apache Superset for Stackable Data Platform ☆35 · Updated this week
- ☆11 · May 16, 2022 · Updated 3 years ago
- Create a data mart using Azure Data Factory as ELT / ETL, Azure Synapse as database and Power BI as visualization tool. ☆20 · Apr 20, 2022 · Updated 3 years ago
- Set of ETL utils for Spark ☆15 · May 4, 2020 · Updated 5 years ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆161 · Dec 10, 2022 · Updated 3 years ago
- Code examples for the Introduction to Kubeflow course ☆15 · Jan 12, 2021 · Updated 5 years ago
- A simplified, lightweight ETL Framework based on Apache Spark ☆588 · Jan 24, 2024 · Updated 2 years ago
- Template for a DuckDB-based, Codespace-oriented sandbox project that is also dbt Cloud compatible, and includes code-first BI tooling via… ☆17 · Apr 7, 2023 · Updated 3 years ago
- Kubernetes operator for Apache Hadoop HDFS used by the Stackable Data Platform ☆53 · Apr 2, 2026 · Updated last week
- ☆11 · Jun 16, 2023 · Updated 2 years ago
- ☆12 · Jul 10, 2022 · Updated 3 years ago
- Template to deploy a Data Product for data stream processing into a Data Landing Zone of the Data Management & Analytics Scenario (former… ☆37 · Jul 17, 2023 · Updated 2 years ago
- Spark data profiling utilities ☆23 · Nov 24, 2018 · Updated 7 years ago
- Multi-stage, config-driven, SQL-based ETL framework using PySpark ☆26 · Sep 16, 2019 · Updated 6 years ago
- Custom XML and JSON marshallers for Grails in an easy way ☆30 · Oct 18, 2016 · Updated 9 years ago
- A map transformer which implements the `Stream Maps` capability from Meltano's tap and target SDK: https://sdk.meltano.com/ ☆19 · Updated this week
- Library which aims to generate Kubernetes YAML templates from an Airflow DAG using the Airflow Kubernetes Pod Operator ☆10 · May 6, 2021 · Updated 4 years ago
- Azure IoT Hub Data Plane Python SDK ☆17 · Jan 17, 2026 · Updated 2 months ago
- ☆13 · Dec 2, 2025 · Updated 4 months ago
- Code snippets used in demos recorded for the blog. ☆41 · Mar 12, 2026 · Updated last month
- ☆11 · Apr 29, 2024 · Updated last year
- Write SQL in Scala ☆30 · Nov 25, 2025 · Updated 4 months ago
- sbt plugin to detect Akka module mismatches and fail build ☆10 · Sep 15, 2025 · Updated 7 months ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,605 · Apr 1, 2026 · Updated 2 weeks ago
- dataX redis writer plugin ☆12 · Jul 13, 2017 · Updated 8 years ago
- An Operator for Apache Druid for Stackable Data Platform ☆12 · Mar 31, 2026 · Updated 2 weeks ago
- Horizon Exchange REST API Server ☆11 · Jan 21, 2026 · Updated 2 months ago
- Supporting files for the Node.js In Action livevideo course from Manning ☆15 · Nov 29, 2018 · Updated 7 years ago
- JNumberTools is an open-source Java library for solving complex problems in combinatorics and number theory. Whether you're a researcher,… ☆15 · Mar 23, 2026 · Updated 3 weeks ago
- Elasticsearch querying library ☆20 · Jun 16, 2019 · Updated 6 years ago
- Data Lineage Tracking And Visualization Solution ☆657 · Apr 3, 2026 · Updated last week
- Hadoop/Hive/Spark container to perform CI tests ☆10 · Dec 26, 2020 · Updated 5 years ago
- A Kubernetes controller and tool for sealing/unsealing Secrets with the help of KMS providers. ☆12 · Apr 20, 2025 · Updated 11 months ago
- ☆14 · Jul 14, 2022 · Updated 3 years ago
- Repository of the metadata specification mobilityDCAT-AP ☆18 · Apr 3, 2026 · Updated last week
- Github bot for keeping your Bazel dependencies up-to-date and clean ☆27 · Mar 20, 2020 · Updated 6 years ago
- Tools to work with the CRAB open data ☆14 · Nov 5, 2013 · Updated 12 years ago
- Keras/Tensorflow 2D Convolutional MNIST Classifier ☆11 · May 19, 2017 · Updated 8 years ago