phdata / sdc-api-tool

A set of utilities to help with management of StreamSets pipelines.

☆13

Alternatives and similar repositories for sdc-api-tool

Users who are interested in sdc-api-tool are comparing it to the libraries listed below.

KeithSSmith / spark-compaction
File compaction tool that runs on top of the Spark framework.
☆59 Updated 6 years ago
qubole / streamx
kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3)
☆95 Updated 6 years ago
sheetaldolas / Hive-JSON-Serde
Read - Write JSON SerDe for Apache Hive.
☆21 Updated 6 years ago
laserson / avro2parquet
Hadoop MapReduce tool to convert Avro data files to Parquet format.
☆34 Updated 12 years ago
hortonworks-spark / cloud-integration
Spark cloud integration: tests, cloud committers and more
☆19 Updated 4 months ago
lensesio / kafka-connect-query-language
SQL for Kafka Connectors
☆98 Updated last year
ExpediaGroup / circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
☆88 Updated last year
ansrivas / spark-structured-streaming
Spark structured streaming with Kafka data source and writing to Cassandra
☆62 Updated 5 years ago
wushujames / kafka-utilities
☆26 Updated 5 years ago
zalando-incubator / spark-json-schema
JSON schema parser for Apache Spark
☆81 Updated 2 years ago
bernhard-42 / Spark-ETL-Atlas
A small project to show how to add lineage to Atlas when using Spark as ETL tool
☆12 Updated 8 years ago
AzimoLabs / kafka-to-avro-writer
Kafka to Avro Writer based on Apache Beam. It's a generic solution that reads data from multiple kafka topics and stores it on in cloud s…
☆25 Updated 4 years ago
mayur2810 / sope
Apache Spark ETL Utilities
☆40 Updated 7 months ago
mmolimar / ksql-jdbc-driver
JDBC driver for Apache Kafka
☆87 Updated 3 years ago
Cargill / pipewrench
Data pipeline automation tool
☆26 Updated last year
mr-jstraub / ambari-node-view
☆14 Updated 8 years ago
ExpediaGroup / shunting-yard
Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
☆20 Updated 3 years ago
randerzander / r-service
Ambari Service definition for deploying R & RHadoop libraries
☆18 Updated 9 years ago
FINRAOS / MegaSparkDiff
A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o…
☆51 Updated last year
sgmarghade / json-to-avro-schema-generator
This will help you to generate AVRO schema from JSON schema.
☆34 Updated 2 years ago
hortonworks-spark / spark-schema-registry
Schema Registry integration for Apache Spark
☆40 Updated 2 years ago
CoxAutomotiveDataSolutions / waimak
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
☆75 Updated last year
bcgov / nifi-atlas
A bridge to Apache Atlas for provenance metadata created in course of using Apache NiFi
☆15 Updated 2 years ago
SponsorPay / jaquet
Spark stream from kafka(json) to s3(parquet)
☆15 Updated 6 years ago
blackberry / KaBoom
A High Performance Cluster Consumer for Kafka that creates Avro (boom) files in Hadoop in time based directory paths
☆42 Updated 9 years ago
sparsecode / DaFlow
Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple…
☆26 Updated 3 years ago
shivajid / atlas
This repository is to help with the Partner Demonstration of the Apache Atlas project.
☆30 Updated 9 years ago
SharpRay / spark-druid-connector
A library for querying Druid data sources with Apache Spark
☆23 Updated 4 years ago
aperepel / nifi-api-deploy
Demonstrates NiFi template deployment and configuration via a REST API
☆70 Updated 8 years ago
Symantec / ambari-cassandra-service
☆26 Updated 8 years ago