getindata / jupyter-images
Recipes of publicly-available Jupyter images
☆8 Updated last month
Related projects
Alternatives and complementary repositories for jupyter-images
- GetInData Helm Charts repository ☆12 Updated 2 years ago
- Big Data Newsletter ☆22 Updated 6 months ago
- Using the Parquet file format (with Avro) to process data with Apache Flink ☆14 Updated 9 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 7 years ago
- ☆16 Updated last year
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 Updated 7 months ago
- Dione - a Spark and HDFS indexing library ☆50 Updated 7 months ago
- ☆18 Updated 6 months ago
- The sane way of building a data layer in Airflow ☆24 Updated 4 years ago
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components ☆11 Updated 5 years ago
- ☆10 Updated 2 years ago
- Kubernetes Operator for the Ververica Platform ☆34 Updated last year
- Automatically loads new partitions in AWS Athena ☆18 Updated 4 years ago
- 📆 Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 Updated 6 years ago
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆66 Updated 8 months ago
- A tool to create Airflow RBAC roles with DAG-level permissions from the CLI. ☆13 Updated last year
- KSQL Syntax Highlighting for VSCode ☆16 Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated last week
- HDFS Automatic Snapshot Service for Linux ☆12 Updated 8 years ago
- Amundsen Gremlin ☆20 Updated 2 years ago
- ☆43 Updated 3 months ago
- An Ansible collection for lifecycle and management of Cloudera CDP Private Cloud resources on bare metal, IaaS, and PaaS. ☆32 Updated last month
- Examples of user defined functions for Apache Drill ☆19 Updated 7 years ago
- ☆24 Updated 2 months ago
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆16 Updated 3 years ago
- Export Airflow metrics (from MySQL) in Prometheus format ☆29 Updated 2 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆86 Updated 8 months ago
- Ansible playbook for automated HDP 2.x deployment install with Kerberos ☆19 Updated 8 years ago