phdata / sdc-api-tool
A set of utilities to help with management of StreamSets pipelines.
☆13 Updated 6 years ago
Related projects
Alternatives and complementary repositories for sdc-api-tool
- Data pipeline automation tool ☆25 Updated 10 months ago
- SQL for Kafka Connectors ☆97 Updated 10 months ago
- JSON schema parser for Apache Spark ☆81 Updated 2 years ago
- File compaction tool that runs on top of the Spark framework. ☆59 Updated 5 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆86 Updated 8 months ago
- Hadoop MapReduce tool to convert Avro data files to Parquet format. ☆34 Updated 11 years ago
- kafka-connect-s3: Ingest data from Kafka to Object Stores (s3) ☆95 Updated 5 years ago
- Test suite for Kafka Connect connectors based on Landoop's Coyote and docker. ☆32 Updated 5 years ago
- Spark structured streaming with Kafka data source and writing to Cassandra ☆64 Updated 4 years ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆48 Updated 10 months ago
- ☆26 Updated 4 years ago
- A High Performance Cluster Consumer for Kafka that creates Avro (boom) files in Hadoop in time based directory paths ☆42 Updated 8 years ago
- This will help you to generate AVRO schema from JSON schema. ☆35 Updated 2 years ago
- Schema Registry integration for Apache Spark ☆39 Updated 2 years ago
- Custom state store providers for Apache Spark ☆93 Updated 2 years ago
- Flink Examples ☆39 Updated 8 years ago
- Spark stream from kafka (json) to s3 (parquet) ☆15 Updated 6 years ago
- Build configuration-driven ETL pipelines on Apache Spark ☆158 Updated 2 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 Updated 3 years ago
- Kafka-Connect SMT (Single Message Transformations) with SQL syntax (Using Apache Calcite for the SQL parsing) ☆32 Updated 4 years ago
- Camus Compressor merges files created by Camus and saves them in a compressed format. ☆12 Updated last year
- Use cases built on SnappyData. Use cases contained here: 1. Ad Analytics 2. Streaming data ingestion from RabbitMQ. ☆32 Updated 2 years ago
- A small project to show how to add lineage to Atlas when using Spark as ETL tool ☆12 Updated 7 years ago
- Shunting Yard is a real-time data replication tool that copies data between Hive Metastores. ☆20 Updated 3 years ago
- Demonstrates NiFi template deployment and configuration via a REST API ☆68 Updated 7 years ago
- Repository for advanced unit-testing with embedded kafka services ☆25 Updated 5 years ago