supermariolabs / spooq
☆38 Updated 10 months ago
Alternatives and similar repositories for spooq
Users who are interested in spooq are comparing it to the libraries listed below:
- Code snippets used in demos recorded for the blog. ☆30 Updated this week
- Extensible streaming ingestion pipeline on top of Apache Spark ☆44 Updated last year
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi… ☆94 Updated 3 weeks ago
- WASP is a framework to build complex real time big data applications. It relies on a kind of Kappa/Lambda architecture mainly leveraging … ☆30 Updated last week
- ☆19 Updated 11 months ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 Updated 3 years ago
- Avro Schema Evolution made easy ☆34 Updated last year
- Flink stream filtering examples ☆19 Updated 8 years ago
- Dione - a Spark and HDFS indexing library ☆51 Updated last year
- Smart Automation Tool for building modern Data Lakes and Data Pipelines ☆120 Updated last week
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark. ☆75 Updated 11 months ago
- CDF Tech Bootcamp ☆9 Updated 5 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ☆30 Updated 4 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated last week
- Big Data Newsletter ☆23 Updated 11 months ago
- Convert XSD -> AVSC and XML -> AVRO ☆36 Updated 3 years ago
- A COBOL parser and Mainframe/EBCDIC data source for Apache Spark ☆142 Updated last week
- Witboost is a versatile platform that addresses a wide range of sophisticated data engineering challenges. The Starter Kit showcases the … ☆21 Updated last week
- Ecosystem website for Apache Flink ☆11 Updated last year
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- Basic framework utilities to quickly start writing production ready Apache Spark applications ☆36 Updated 3 months ago
- Dynamic Conformance Engine ☆31 Updated 3 months ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆88 Updated last year
- Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines. ☆84 Updated this week
- Resilient data pipeline framework running on Apache Spark ☆24 Updated this week
- A Table format agnostic data sharing framework ☆38 Updated last year
- ☆63 Updated 5 years ago
- Unity Catalog UI ☆40 Updated 6 months ago
- ☆27 Updated 2 months ago
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ☆41 Updated 6 months ago