Extensible streaming ingestion pipeline on top of Apache Spark
☆46 · Jul 17, 2025 · Updated 8 months ago
Alternatives and similar repositories for hyperdrive
Users that are interested in hyperdrive are comparing it to the libraries listed below.
- Resilient data pipeline framework running on Apache Spark ☆27 · Apr 1, 2026 · Updated last week
- Dynamic Conformance Engine ☆32 · Mar 26, 2026 · Updated 2 weeks ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ☆29 · Nov 4, 2024 · Updated last year
- Scala API for Apache Spark SQL high-order functions ☆14 · Aug 4, 2023 · Updated 2 years ago
- A JDBC streaming source for Spark ☆10 · Feb 19, 2024 · Updated 2 years ago
- Avro SerDe for Apache Spark structured APIs. ☆242 · Jun 10, 2025 · Updated 9 months ago
- A COBOL parser and Mainframe/EBCDIC data source for Apache Spark ☆161 · Mar 27, 2026 · Updated last week
- Cloud based Data Platform based on Apache Spark ☆27 · Feb 17, 2026 · Updated last month
- Task Metrics Explorer ☆14 · Apr 2, 2019 · Updated 7 years ago
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines ☆17 · Jan 21, 2020 · Updated 6 years ago
- Example of using Faust with Docker ☆23 · Sep 30, 2019 · Updated 6 years ago
- A simple Spark-powered ETL framework that just works 🍺 ☆186 · Oct 2, 2025 · Updated 6 months ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 · Jul 11, 2018 · Updated 7 years ago
- A simplified, lightweight ETL Framework based on Apache Spark ☆587 · Jan 24, 2024 · Updated 2 years ago
- This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w… ☆16 · Oct 3, 2025 · Updated 6 months ago
- Spark Structured Streaming JDBC Sink ☆16 · Apr 26, 2021 · Updated 4 years ago
- Open Source Secret Provider plugin for the Kafka Connect framework ☆47 · Jul 19, 2024 · Updated last year
- Data quality tools for Big Data ☆19 · Oct 10, 2019 · Updated 6 years ago
- Gather system information about airflow processes ☆18 · Mar 12, 2020 · Updated 6 years ago
- A Java JAXB library for generating events conforming to the Event Logging XML Schema ☆16 · Aug 5, 2024 · Updated last year
- WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging … ☆31 · Oct 28, 2025 · Updated 5 months ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆62 · Sep 6, 2024 · Updated last year
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌 ☆29 · May 15, 2020 · Updated 5 years ago
- bat is a cat clone with wings! ...and Docker ☆21 · Jun 18, 2019 · Updated 6 years ago
- Multi-stage, config driven, SQL based ETL framework using PySpark ☆26 · Sep 16, 2019 · Updated 6 years ago
- Utility for benchmarking changes in Spark using TPC-DS workloads ☆16 · Jun 3, 2021 · Updated 4 years ago
- Data validation library for PySpark 3.0.0 ☆33 · Nov 11, 2022 · Updated 3 years ago
- Apache Amaterasu ☆56 · Oct 18, 2019 · Updated 6 years ago
- Spark Library for Bulk Loading into Cassandra ☆12 · Apr 18, 2018 · Updated 7 years ago
- ☆63 · Nov 8, 2019 · Updated 6 years ago
- Spark Structured Streaming State Tools ☆34 · Jul 3, 2020 · Updated 5 years ago
- Collection of Interesting Algorithms ☆16 · Oct 13, 2020 · Updated 5 years ago
- Filling in the Spark function gaps across APIs ☆50 · Apr 14, 2021 · Updated 4 years ago
- The dbt-spark-livy adapter allows you to use dbt along with Apache Spark, by connecting via Apache Livy ☆12 · Mar 30, 2023 · Updated 3 years ago
- An example of SparkConnect extension. ☆15 · Mar 5, 2024 · Updated 2 years ago
- A script to automate and simplify simple system tasks, such as service control, package control, system monitoring, pinging etc. This scr… ☆10 · Nov 27, 2022 · Updated 3 years ago
- Akka plugin to collect various data about actors ☆17 · Aug 19, 2024 · Updated last year
- ☆11 · Oct 11, 2022 · Updated 3 years ago
- Package to extend Airflow functionality with CWL v1.0 support ☆12 · Jun 12, 2019 · Updated 6 years ago