provectus / streaming-data-platform
☆24 · Updated 3 years ago
Alternatives and similar repositories for streaming-data-platform
Users that are interested in streaming-data-platform are comparing it to the libraries listed below
- Data Quality Gate based on AWS ☆56 · Updated last year
- Airflow declarative DAGs via YAML ☆132 · Updated last year
- Reference Dockerfiles for production usage ☆24 · Updated 5 years ago
- ITSumma Spark Greenplum Connector ☆38 · Updated last year
- Data Engineering Digest ☆28 · Updated last year
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 · Updated 3 years ago
- Deploy Presto on the cloud easily, using Terraform and Packer ☆45 · Updated 2 years ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆158 · Updated 2 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆69 · Updated 4 months ago
- Spark to Tableau Extractor library ☆18 · Updated 7 years ago
- Command-line interface to quickly generate fake CSV and JSON data ☆73 · Updated last year
- For a series of posts on Amazon MSK, Amazon EKS, and Amazon EMR ☆67 · Updated 3 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆88 · Updated last year
- ☆95 · Updated 2 years ago
- Oozie Workflow to Airflow DAGs migration tool ☆87 · Updated 4 months ago
- A tool to create Airflow RBAC roles with DAG-level permissions from the CLI. ☆13 · Updated last year
- Data validation library for PySpark 3.0.0 ☆33 · Updated 2 years ago
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics ☆64 · Updated last year
- Quickstart PySpark with Anaconda on AWS/EMR using Terraform ☆47 · Updated 6 months ago
- Open Source Kafka Connect Connector plugin repository built and maintained by Instaclustr ☆12 · Updated 4 months ago
- Kafka Connect Store Partitioner by custom fields and time ☆40 · Updated 3 years ago
- Aiven's collection of Single Message Transformations (SMTs) for Apache Kafka Connect ☆80 · Updated 2 weeks ago
- Repository of helm charts for deploying DataHub on a Kubernetes cluster ☆191 · Updated this week
- DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes. ☆266 · Updated 3 months ago
- Snowflake Kafka Connector (Sink Connector) ☆158 · Updated last week
- Aiven's S3 Sink Connector for Apache Kafka® ☆70 · Updated 10 months ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications ☆36 · Updated 7 months ago
- Streaming Synthetic Sales Data Generator: Streaming sales data generator for Apache Kafka, written in Python ☆44 · Updated 2 years ago
- Common Transforms for Kafka Connect. ☆161 · Updated 11 months ago
- Convert XSD -> AVSC and XML -> AVRO ☆36 · Updated 3 years ago