streamsets / pipeline-library

Pipeline library for StreamSets Data Collector and Transformer

☆33

Alternatives and similar repositories for pipeline-library

Users who are interested in pipeline-library are comparing it to the repositories listed below.

TrivadisPF / platys-modern-data-platform
Support for generating modern platforms dynamically with services such as Kafka, Spark, StreamSets, HDFS, ...
☆77 Updated last week
cdapio / hydrator-plugins
Cask Hydrator Plugins Repository
☆68 Updated 2 weeks ago
bbende / nifi-streaming-examples
Collection of examples integrating NiFi with stream process frameworks.
☆59 Updated 9 years ago
odpi / data-governance
Egeria's Guidance on Governance as well as large media files such as presentations and movies
☆106 Updated 3 years ago
TorchAIKC / nifi-stateless-operator
An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes
☆53 Updated 5 years ago
cloudera-labs / edge2ai-workshop
☆31 Updated 2 months ago
smart-data-lake / smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
☆123 Updated this week
apache / incubator-datalab
Apache DataLab (incubating)
☆152 Updated 2 years ago
rahulcodewiz / spark-drools
spark-drools tutorials
☆16 Updated last year
saikrishnapujari / Spark-Drools-Integration
☆23 Updated 6 years ago
projectnessie / nessie-demos
Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.
☆30 Updated last week
data-integrations / wrangler
Wrangler Transform: A DMD system for transforming Big Data
☆106 Updated 3 months ago
sparsecode / DaFlow
Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple…
☆26 Updated 4 years ago
BrooksIan / Flink2Kafka
A Flink application that demonstrates reading and writing to/from Apache Kafka with Apache Flink
☆20 Updated 2 years ago
arempter / hive-metastore-docker
Example for article Running Spark 3 with standalone Hive Metastore 3.0
☆102 Updated 2 years ago
sibytes / yetl
Yet Another (Spark) ETL Framework
☆21 Updated 2 years ago
dremio-hub / dremio-flight-connector
Dremio Flight connector. Access Dremio using Arrow flight
☆39 Updated 4 years ago
polyzos / stream-processing-with-apache-flink
☆62 Updated last year
microsoft / MonitoFi
MonitoFi: Health & Performance Monitor for your Apache NiFi
☆67 Updated 2 years ago
cloudcheflabs / dataroaster
☆40 Updated 2 years ago
agile-lab-dev / DataQuality
DataQuality for BigData
☆144 Updated last year
godatadriven / dbt-data-ai-summit
Code that was used as an example during the Data+AI Summit 2020
☆15 Updated 4 years ago
g1thubhub / phil_stopwatch
☆39 Updated 6 years ago
XavientInformationSystems / Data-Ingestion-Platform
☆50 Updated 5 years ago
aws-samples / realtime-bushfire-alert-with-apache-flink-cep
Code and documentation for the demonstration example of the real-time bushfire alerting with the Complex Event Processing (CEP) in Apache…
☆26 Updated 7 years ago
ververica / lab-fraud-detection
Demo code for implementing and showcasing a Fraud Detection Engine with Apache Flink.
☆32 Updated 3 years ago
streamsets / datacollector-docker
Dockerfiles for StreamSets Data Collector
☆114 Updated 9 months ago
dimajix / flowman
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi…
☆97 Updated last month
0x0ece / beam-starter
Get started with Apache Beam and Flink
☆43 Updated 9 years ago
minio / spark-select
A library for Spark DataFrame using MinIO Select API
☆99 Updated 6 years ago