Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
☆97 · Apr 15, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for flowman
Users that are interested in flowman are comparing it to the libraries listed below.
- A library enabling DAG structuring of data processing programs such as ETLs ☆17 · Apr 13, 2026 · Updated 3 weeks ago
- Operator for Apache Superset for Stackable Data Platform ☆35 · Apr 29, 2026 · Updated last week
- ☆11 · May 16, 2022 · Updated 3 years ago
- Set of ETL utils for Spark ☆15 · May 4, 2020 · Updated 6 years ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆160 · Dec 10, 2022 · Updated 3 years ago
- A simplified, lightweight ETL Framework based on Apache Spark ☆588 · Jan 24, 2024 · Updated 2 years ago
- Kubernetes operator for Apache Hadoop HDFS used by the Stackable Data Platform ☆52 · Updated this week
- ☆12 · Jul 10, 2022 · Updated 3 years ago
- Template to deploy a Data Product for data stream processing into a Data Landing Zone of the Data Management & Analytics Scenario (former… ☆36 · Jul 17, 2023 · Updated 2 years ago
- Spark data profiling utilities ☆23 · Nov 24, 2018 · Updated 7 years ago
- In-Memory Java Compiler ☆12 · Oct 13, 2020 · Updated 5 years ago
- Multi-stage, config driven, SQL based ETL framework using PySpark ☆26 · Sep 16, 2019 · Updated 6 years ago
- A map transformer which implements the `Stream Maps` capability from Meltano's tap and target SDK: https://sdk.meltano.com/ ☆19 · Apr 27, 2026 · Updated last week
- ZIO wrapper for AWS S3 SDK async client ☆11 · Feb 21, 2020 · Updated 6 years ago
- React Bootstrap 4 Tabs Component ☆11 · Feb 3, 2023 · Updated 3 years ago
- Code snippets used in demos recorded for the blog. ☆41 · Updated this week
- Trino load balancer with support for routing, queueing and auto-scaling ☆37 · Apr 20, 2026 · Updated 2 weeks ago
- ⚛️ 🅱️ 🗂 ↔️ Off-canvas navigation for react using bootstrap ☆14 · Apr 16, 2019 · Updated 7 years ago
- sbt plugin to detect Akka module mismatches and fail build ☆10 · Sep 15, 2025 · Updated 7 months ago
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,614 · Updated this week
- dataX redis writer plugin ☆12 · Jul 13, 2017 · Updated 8 years ago
- An Operator for Apache Druid for Stackable Data Platform ☆12 · Apr 29, 2026 · Updated last week
- Horizon Exchange REST API Server ☆11 · Jan 21, 2026 · Updated 3 months ago
- PySpark Tutorial for Beginners on Google Colab: Hands-On Guide ☆17 · Sep 13, 2020 · Updated 5 years ago
- Elasticsearch querying library ☆20 · Jun 16, 2019 · Updated 6 years ago
- Data Lineage Tracking And Visualization Solution ☆657 · Apr 23, 2026 · Updated last week
- ☆14 · Jul 14, 2022 · Updated 3 years ago
- Github bot for keeping your Bazel dependencies up-to-date and clean ☆27 · Mar 20, 2020 · Updated 6 years ago
- Where the Meltano team runs Meltano! Get it??? ☆31 · Apr 9, 2025 · Updated last year
- This dbt package contains macros to support unit testing that can be (re)used across dbt projects. ☆448 · Feb 11, 2025 · Updated last year
- Kubernetes Operator for Apache HBase built by Stackable for the Stackable Data Platform ☆20 · Apr 29, 2026 · Updated last week
- UI for mondrian-rest ☆20 · Apr 17, 2019 · Updated 7 years ago
- Capture, save, and analyze AWS Redshift performance metrics ☆17 · Oct 6, 2017 · Updated 8 years ago
- JNumberTools is an open-source Java library for solving complex problems in combinatorics and number theory. Whether you're a researcher,… ☆15 · Mar 23, 2026 · Updated last month
- A Hello World VS Code extension with ScalaJS. ☆14 · Nov 9, 2023 · Updated 2 years ago
- Basic Spark utilities ☆13 · Feb 20, 2025 · Updated last year
- A simple Spark-powered ETL framework that just works 🍺 ☆186 · Oct 2, 2025 · Updated 7 months ago
- A domain specific language for creating scientific pipelines ☆15 · Apr 21, 2026 · Updated 2 weeks ago
- Stackable Operator for Apache Kafka ☆28 · Updated this week