BitwiseInc/Hydrograph

BitwiseInc / Hydrograph

A visual ETL development and debugging tool for big data

☆157

Alternatives and similar repositories for Hydrograph

Users who are interested in Hydrograph are comparing it to the repositories listed below.

wormsleep / etl
ETL Tools: data extract-transform-load tools
☆80 · Oct 21, 2016 · Updated 9 years ago
avensolutions / cdc-at-scale-using-spark
Scalable CDC Pattern Implemented using PySpark
☆18 · Oct 8, 2025 · Updated 9 months ago
onc-healthit / fhir-tools
Source code for the FHIR tools on SITE
☆13 · Nov 26, 2024 · Updated last year
datasphere-oss / datasphere-service
an open source dataworks platform
☆20 · Jun 4, 2021 · Updated 5 years ago
grafeas / grafeas-pgsql
Grafeas with PostgreSQL backend
☆15 · Jul 15, 2026 · Updated last week
TorchAIKC / nifi-stateless-operator
An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes
☆53 · Jun 11, 2020 · Updated 6 years ago
uber / marmaray
Generic Data Ingestion & Dispersal Library for Hadoop
☆483 · Mar 19, 2023 · Updated 3 years ago
datasphere-oss / datasphere-integration
A data-centric integration platform
☆49 · Aug 9, 2021 · Updated 4 years ago
Nextdoor / bender
Bender - Serverless ETL Framework
☆188 · Dec 19, 2023 · Updated 2 years ago
zeeshanabid94 / presto
Distributed SQL query engine for big data
☆31 · Mar 12, 2019 · Updated 7 years ago
maropu / spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
☆34 · Sep 8, 2022 · Updated 3 years ago
dobachi / ansible-bigdata
Ansible playbooks to construct distributed computing environments
☆62 · Jun 6, 2021 · Updated 5 years ago
cartershanklin / hive-druid-ssb
Star Schema Benchmark using the Hive / Druid Integration
☆30 · Nov 9, 2017 · Updated 8 years ago
NikhilSuthar / Scala-Spark-Mail
Scala utility to send mail
☆14 · May 4, 2020 · Updated 6 years ago
Teradata / kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies…
☆1,111 · Jan 12, 2023 · Updated 3 years ago
AlexMercedCoder / architecting-an-apache-iceberg-lakehouse
☆17 · May 15, 2026 · Updated 2 months ago
wgzhao / presto-clickhouse
ClickHouse connector both for PrestoSQL and Trino
☆18 · Mar 3, 2021 · Updated 5 years ago
zlmoment / Magnet-Links-Search-Engine
A magnet links search engine built with Python.
☆14 · Jul 22, 2014 · Updated 12 years ago
microsoft / pyspark_propensity_matching
Library for conducting propensity matching at Spark scale
☆14 · Jun 27, 2023 · Updated 3 years ago
hortonworks / data_analytics_studio
☆17 · Dec 7, 2022 · Updated 3 years ago
tspannhw / nifi-nlp-processor
Apache NiFi NLP Processor
☆18 · May 8, 2026 · Updated 2 months ago
deepsense-ai / seahorse
☆109 · Nov 9, 2022 · Updated 3 years ago
Cascading / cascading-hive
Integration for Cascading and Apache Hive
☆25 · Oct 31, 2017 · Updated 8 years ago
saikrishnapujari / Spark-Nested-Data-Parser
Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark
☆16 · Jan 22, 2024 · Updated 2 years ago
aravinthsci / Spark_Delta_Lake
Delta Lake Examples
☆11 · Apr 24, 2020 · Updated 6 years ago
hygieia / hygieia
CapitalOne DevOps Dashboard
☆3,817 · Sep 29, 2023 · Updated 2 years ago
cdapio / cdap
An open source framework for building data analytic applications.
☆789 · Jul 13, 2026 · Updated last week
mayur2810 / sope
Apache Spark ETL Utilities
☆40 · Oct 23, 2024 · Updated last year
ExpediaGroup / stream-registry
Stream Discovery and Stream Orchestration
☆124 · Jan 7, 2026 · Updated 6 months ago
FINRAOS / HiveQLUnit
Test your Hive scripts inside your favorite IDE with HiveQLUnit! Increase your developers' productivity by testing on all operating system…
☆41 · Oct 13, 2020 · Updated 5 years ago
dazheng / SparkETL
Implement a complete data warehouse ETL using Spark SQL
☆14 · Sep 8, 2022 · Updated 3 years ago
godatadriven / iterative-broadcast-join
The iterative broadcast join example code.
☆71 · Oct 23, 2017 · Updated 8 years ago
d2rq / r2rml-kit
Implementation of W3C's R2RML and Direct Mapping specifications
☆10 · Oct 12, 2020 · Updated 5 years ago
allenday / R-Storm
☆32 · May 17, 2015 · Updated 11 years ago
hbutani / icebergSQL
Integration of Iceberg table management into Spark SQL
☆11 · Jan 21, 2020 · Updated 6 years ago
masayuki038 / calcite-arrow-sample
calcite-arrow-sample (WIP)
☆13 · Dec 17, 2017 · Updated 8 years ago
tmalaska / Spark.TableStatsExample
Simple Spark example of generating table stats for use of data quality checks
☆27 · Apr 28, 2017 · Updated 9 years ago
ExpediaGroup / datasqueeze
Hadoop utility to compact small files
☆18 · Feb 16, 2026 · Updated 5 months ago
Hydrospheredata / mist
Serverless proxy for Spark cluster
☆325 · Apr 13, 2026 · Updated 3 months ago