Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work
☆47Jul 13, 2022Updated 3 years ago
Alternatives and similar repositories for modern-data-lake-storage-layers
Users that are interested in modern-data-lake-storage-layers are comparing it to the libraries listed below.
- This repository provides the resources required for the Amazon Redshift Streaming workshop☆13Jul 12, 2023Updated 2 years ago
- Amazon EMR Notebook to show how to read from and write to Delta tables with Amazon EMR☆17Apr 27, 2025Updated 11 months ago
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.☆53Oct 31, 2023Updated 2 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)☆66Sep 23, 2023Updated 2 years ago
- Proof of concept of a big data cluster using open source tools☆11Apr 10, 2024Updated 2 years ago
- Delta-Lake, ETL, Spark, Airflow☆49Oct 9, 2022Updated 3 years ago
- Auto-fixing errors caused by version upgrades, good practices, etc.☆11Sep 5, 2020Updated 5 years ago
- RFC (request for comments) for changes to RisingWave☆18Dec 4, 2024Updated last year
- ☆16May 9, 2022Updated 3 years ago
- ☆18Jun 16, 2024Updated last year
- ☆11Apr 27, 2021Updated 4 years ago
- 🌉 Reference implementation for granting cross-account AWS Glue Data Catalog access from Amazon Athena☆30Jul 25, 2022Updated 3 years ago
- ☆32Jan 30, 2026Updated 2 months ago
- Modernize your Data Warehouse☆15Nov 12, 2024Updated last year
- AWS Lambda function - automatically PGP encrypts files added to S3 bucket☆16May 3, 2022Updated 3 years ago
- FHIR Storage, Query & Analytics Track☆19Sep 6, 2018Updated 7 years ago
- Bits of code I use during live demos☆30Dec 19, 2024Updated last year
- Sample datasets and code for operationalizing Amazon Fraud Detector using SageMaker DataWrangler, Feature Store, and Pipelines.☆18Dec 1, 2022Updated 3 years ago
- EMR Hudi Workshop content☆12Dec 10, 2021Updated 4 years ago
- Companion repository for the "Streamlining AWS Glue CI/CD — A Comprehensive Blueprint" blog post☆11Nov 8, 2024Updated last year
- Repository dedicated to the Data Lakehouse with Delta Lake workshop☆17Dec 6, 2021Updated 4 years ago
- ☆18Apr 14, 2023Updated 3 years ago
- docs, codes and resources to prepare for the CRT020: Databricks Certified Associate Developer for Apache Spark 2.4 with Python 3 certific…☆10Sep 25, 2019Updated 6 years ago
- A collection of utility scripts to manage images☆12Mar 31, 2026Updated 2 weeks ago
- Repository for the paper "Discovering and Categorising Language Biases in Reddit" accepted at the International Conference on Web and Soc…☆12Aug 20, 2024Updated last year
- Code to munge data between Kaggle .tsv Rotten Tomatoes Sentiment Analysis data set and Vowpal Wabbit☆24Jun 22, 2014Updated 11 years ago
- This repository describes how to use avro-tools☆12Feb 21, 2018Updated 8 years ago
- A BigQuery adapter for Harlequin, a SQL IDE for the terminal.☆10Jan 19, 2025Updated last year
- ☆21Dec 3, 2025Updated 4 months ago
- Example code for running Spark and Hive jobs on EMR Serverless.☆169Mar 11, 2026Updated last month
- Docker compose and Google Colab demo to build a CDC with Delta Lake☆15Sep 7, 2022Updated 3 years ago
- dbt / Amazon Redshift Demonstration Project☆34Jan 3, 2023Updated 3 years ago
- Making the transition from Scratch to Python☆11Apr 11, 2017Updated 9 years ago
- Unity Catalog Explorer is a TypeScript + Next.js based Web UI for the Unity Catalog OSS.☆13Jun 29, 2024Updated last year
- A Kivy tutorial for PyOhio 2013☆14Apr 30, 2014Updated 11 years ago
- ☆11Oct 13, 2025Updated 6 months ago
- ☆12Aug 17, 2023Updated 2 years ago
- "Unix & Linux Shell Script Exercise Dictionary" - Hanbit Media☆10Jan 17, 2017Updated 9 years ago
- A dotnet standard wrapper for the Uniswap V2 Subgraph on The Graph GraphQL API.☆12Dec 17, 2020Updated 5 years ago