gschmutz / various-demos
Various Demos mostly based on docker environments
☆33Updated 3 years ago
Alternatives and similar repositories for various-demos
Users that are interested in various-demos are comparing it to the libraries listed below
- spark on kubernetes☆104Updated 2 years ago
- Kafka Connect REST connector☆112Updated 3 years ago
- ❤for real-time DataOps - where the application and data fabric blends - Lenses☆160Updated 3 weeks ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi…☆97Updated this week
- Data validation library for PySpark 3.0.0☆33Updated 3 years ago
- Takes a Kafka stream into Spark, applies transformations, and sinks into Druid. Everything Dockerised.☆30Updated 2 years ago
- How to manage Slowly Changing Dimensions with Apache Hive☆55Updated 6 years ago
- Presentations and other resources.☆36Updated 5 years ago
- A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python,…☆209Updated 6 years ago
- Code and presentation for Strata Model Serving tutorial☆68Updated 6 years ago
- A proof of concept using Divolte, Kafka, Druid and Superset☆62Updated 5 years ago
- Collection of examples integrating NiFi with stream process frameworks.☆59Updated 9 years ago
- ☆81Updated 2 years ago
- Get started with Apache Beam and Flink☆43Updated 9 years ago
- One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)☆120Updated 4 years ago
- Kafka Connect connector for reading CSV files into Kafka.☆170Updated 7 months ago
- Multiple node presto cluster on docker container☆126Updated 3 years ago
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an…☆62Updated last year
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o…☆52Updated 6 months ago
- Deep Learning UDF for KSQL, the Streaming SQL Engine for Apache Kafka with Elasticsearch Sink Example☆79Updated 7 years ago
- ☆64Updated last year
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....☆78Updated last week
- ☆100Updated 2 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor…☆70Updated 4 months ago
- Kafka Connect FileSystem Connector☆112Updated 3 years ago
- A hybrid Big Data pipeline architecture that combines a real-time streaming layer with a batch layer to process large datasets (Lambda Arc…☆188Updated 4 months ago
- Basic framework utilities to quickly start writing production ready Apache Spark applications☆36Updated last year
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.☆91Updated last year
- Fully reproducible, Dockerized, step-by-step, demo on how to stream tables from Postgres to Kafka/KSQL back to Postgres. Detailed blog p…☆152Updated 4 years ago
- Real-world Spark pipelines examples☆83Updated 7 years ago