ttauveron / k8s-big-data-experiments
Experiments produced during an end-of-studies project (ETS, H2018)
☆15 Updated 6 years ago
Alternatives and similar repositories for k8s-big-data-experiments:
Users who are interested in k8s-big-data-experiments are comparing it to the repositories listed below
- Schema Registry integration for Apache Spark ☆40 Updated 2 years ago
- Scala SDK for working with Snowplow enriched events in Spark, AWS Lambda, Flink et al. ☆20 Updated 2 months ago
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆73 Updated 5 years ago
- Delta Lake Examples ☆12 Updated 4 years ago
- A Facebook for data ☆26 Updated 5 years ago
- ETLy is an add-on dashboard service on top of Apache Airflow. ☆69 Updated last year
- Data validation library for PySpark 3.0.0 ☆34 Updated 2 years ago
- An integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆81 Updated 4 years ago
- AMQP data source for dstream (Spark Streaming) ☆26 Updated 2 years ago
- REST-like API exposing Airflow data and operations ☆61 Updated 6 years ago
- ☆26 Updated 8 years ago
- Apache (Py)Spark type annotations (stub files). ☆115 Updated 2 years ago
- Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic. ☆28 Updated 7 years ago
- A library for Spark DataFrame using MinIO Select API ☆97 Updated 5 years ago
- Docker Image and Kubernetes Configurations for Spark 2.x ☆41 Updated 5 years ago
- Export Airflow metrics (from MySQL) in Prometheus format ☆29 Updated 2 years ago
- Dockerized Hadoop/Minio/Hive/Presto stack ☆36 Updated 10 months ago
- Type-class based data cleansing library for Apache Spark SQL ☆79 Updated 5 years ago
- Ansible playbooks for Apache Spark on kube ☆27 Updated 7 years ago
- Spark on Kubernetes infrastructure Docker images repo ☆37 Updated 2 years ago
- Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark. ☆75 Updated 8 months ago
- Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry. ☆18 Updated 7 years ago
- ☆63 Updated 5 years ago
- SQL for Kafka Connectors ☆97 Updated last year
- Lighthouse is a library for data lakes built on top of Apache Spark. It provides high-level APIs in Scala to streamline data pipelines an… ☆61 Updated 4 months ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 Updated 6 years ago
- Task Metrics Explorer ☆13 Updated 5 years ago