dimajix/flowman

Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.

☆97

Alternatives and similar repositories for flowman

Users who are interested in flowman are comparing it to the libraries listed below.

Azure / DW-with-Synapse-Data-Factory-Power-BI
Create a data mart using Azure Data Factory as ELT / ETL, Azure Synapse as database and Power BI as visualization tool.
☆19 · Apr 20, 2022 · Updated 4 years ago
Nike-Inc / spark-expectations
A Python Library to support running data quality rules while the spark job is running⚡
☆201 · Jul 14, 2026 · Updated last week
datastacktv / kubeflow-introduction
Code examples for the Introduction to Kubeflow course
☆15 · Jan 12, 2021 · Updated 5 years ago
konrads / spark-etl
Set of ETL utils for Spark
☆15 · May 4, 2020 · Updated 6 years ago
YotpoLtd / metorikku
A simplified, lightweight ETL Framework based on Apache Spark
☆588 · Jan 24, 2024 · Updated 2 years ago
kedacore / external-scalers
Explore external scalers built by the community.
☆12 · Jun 15, 2026 · Updated last month
davidkarlsen / flyway-operator
k8s operator for Flyway migrations
☆12 · Updated this week
Azure / data-product-streaming
Template to deploy a Data Product for data stream processing into a Data Landing Zone of the Data Management & Analytics Scenario (former…
☆36 · Jul 17, 2023 · Updated 3 years ago
intuit / superglue
Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs …
☆160 · Dec 10, 2022 · Updated 3 years ago
meltano / jaffle-shop-template
Template for a DuckDB-based, Codespace-oriented sandbox project that is also dbt Cloud compatible, and includes code-first BI tooling via…
☆17 · Apr 7, 2023 · Updated 3 years ago
Azure / Azure-IoT-Security
Secure Azure IoT solutions end to end
☆14 · Nov 28, 2022 · Updated 3 years ago
picadoh / imc
In-Memory Java Compiler
☆12 · Oct 13, 2020 · Updated 5 years ago
MarwaEshra / Create-Interactive-Dashboards-with-Streamlit-and-Python
Create Interactive Dashboards with Streamlit and Python Coursera
☆10 · Jun 19, 2020 · Updated 6 years ago
scholzj / kafka-kubernetes-authenticator
Kafka Kubernetes Authenticator and Authorizer
☆12 · Sep 5, 2023 · Updated 2 years ago
jasonsatran / spark-meta
Spark data profiling utilities
☆23 · Nov 24, 2018 · Updated 7 years ago
avensolutions / spark-sql-etl-framework
Multi-stage, config driven, SQL based ETL framework using PySpark
☆26 · Sep 16, 2019 · Updated 6 years ago
Gamesight / secret-sync-operator
A Kubernetes operator enabling cross-cluster secret syncing
☆13 · Dec 16, 2025 · Updated 7 months ago
mobilityDCAT-AP / mobilityDCAT-AP
Repository of the metadata specification mobilityDCAT-AP
☆18 · Updated this week
MeltanoLabs / meltano-map-transform
A map transformer which implements the `Stream Maps` capability from Meltano's tap and target SDK: https://sdk.meltano.com/
☆19 · Updated this week
BranislavLazic / aws-zio-s3
ZIO wrapper for AWS S3 SDK async client
☆11 · Feb 21, 2020 · Updated 6 years ago
curityio / idsvr-helm
This repository contains the Curity Identity Server helm chart source code.
☆11 · Jun 16, 2026 · Updated last month
indiacloudtv / pyspark_on_google_colab
PySpark Tutorial for Beginners on Google Colab: Hands-On Guide
☆17 · Sep 13, 2020 · Updated 5 years ago
bartosz25 / spark-playground
Code snippets used in demos recorded for the blog.
☆42 · Apr 30, 2026 · Updated 2 months ago
johanandren / sbt-akka-version-check
sbt plugin to detect Akka module mismatches and fail build
☆10 · Sep 15, 2025 · Updated 10 months ago
openshift-integration / camel-k-example-knative
☆11 · Apr 29, 2024 · Updated 2 years ago
awslabs / deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
☆3,637 · Updated this week
stackabletech / trino-lb
Trino load balancer with support for routing, queueing and auto-scaling
☆37 · Jul 1, 2026 · Updated 3 weeks ago
kiwiz / esquery
Elasticsearch querying library
☆20 · Jun 16, 2019 · Updated 7 years ago
Azure / Azure-Machine-Learning-Adoption-Framework
this repo provides best practice guidance, plan template, solution assessment tool etc. to help Machine Learning Studio(classic) customer…
☆20 · Jul 23, 2024 · Updated 2 years ago
AbsaOSS / spline
Data Lineage Tracking And Visualization Solution
☆663 · Updated this week
absognety / atomic-scala
Atomic Scala Book Solutions - for Beginners and first time Functional Programmers
☆12 · Mar 10, 2020 · Updated 6 years ago
rueian / kinko
A Kubernetes controller and tool for sealing/unsealing Secrets with the help of KMS providers.
☆12 · Apr 20, 2025 · Updated last year
rockthejvm / cqrs-akka-cassandra-demo
☆14 · Jul 14, 2022 · Updated 4 years ago
zjffdu / zeppelin-notebook
☆12 · Jul 10, 2022 · Updated 4 years ago
joomcode / spark-platform
Basic Spark utilities
☆13 · Updated this week
ruzickap / k8s-istio-demo
Demo showing the capabilities of Istio
☆25 · Aug 20, 2024 · Updated last year
Azure / IoT-Pi-Day
Workshop to build out a real-life IoT scenario by capturing IoT data and ingesting it into the Azure Cloud.
☆29 · Dec 8, 2022 · Updated 3 years ago
SETL-Framework / setl
A simple Spark-powered ETL framework that just works 🍺
☆186 · Oct 2, 2025 · Updated 9 months ago
joswlv / Spark2CassandraBulkLoad
Spark Library for Bulk Loading into Cassandra
☆12 · Jan 28, 2021 · Updated 5 years ago