aws-samples / iceberg-streaming-examples
This repo contains examples of high-throughput ingestion using Apache Spark and Apache Iceberg. The examples cover IoT and CDC scenarios and follow best practices. The code can be deployed to any Spark-compatible engine, such as Amazon EMR Serverless or AWS Glue. A fully local developer environment is also provided.
☆13 Updated last week
Related projects
Alternatives and complementary repositories for iceberg-streaming-examples
- Streaming ETL job cases in AWS Glue to integrate Iceberg and create an in-place updatable data lake on Amazon S3 ☆18 Updated 2 months ago
- ☆13 Updated 3 months ago
- Operational Data Processing Framework developed using AWS Glue and Apache Hudi. This framework is suitable for Data Lake and Modern Data … ☆21 Updated last year
- ☆15 Updated last year
- Sample code to collect Apache Iceberg metrics for table monitoring ☆19 Updated 2 months ago
- Starter project to create a dashboard to interact with Amazon Connect Global Resiliency APIs ☆11 Updated 8 months ago
- Streamlit examples with LLM from Bedrock ☆12 Updated 3 months ago
- ☆12 Updated 11 months ago
- ☆9 Updated 5 months ago
- This repository contains the source code of the Verifiable Controls Evidence Store solution ☆19 Updated last year
- Describes the concepts of lambda architecture and the actual deployment process with an example of building a serverless business intelli… ☆13 Updated 5 months ago
- dApp authentication with Amazon Cognito and Web3 proxy with Amazon API Gateway ☆13 Updated 3 months ago
- ☆30 Updated 8 months ago
- The Generative AI Atlas is an organized repository designed for individuals seeking to explore the newest content released by AWS on Gene… ☆14 Updated last month
- This project is an example of using AWS Step Functions to manage and track a series of AWS Batch jobs in N_TO_N mode. ☆11 Updated 9 months ago
- Seed-Farmer is an orchestration tool that works with AWS CodeSeeder and acts as an orchestration tool modeled after GitOps deployments. I… ☆45 Updated this week
- Detect AWS usage anomalies in near-real time using OpenSearch Anomaly Detection and CloudTrail for improved cost management and security ☆30 Updated 5 months ago
- ☆11 Updated 2 months ago
- An Amazon Kendra REST API CDK example with an API Gateway, including authentication with AWS Cognito and AWS X-Ray Tracing ☆15 Updated 3 months ago
- Serverless Datalake architecture ☆13 Updated last year
- ☆41 Updated 8 months ago
- This repository contains the code and infrastructure as code for a Generative AI-powered Request for Proposal (RFP) Assistant leveraging … ☆15 Updated last week
- ☆15 Updated 4 months ago
- This repository shows how to set up Centralized CloudWatch Observability Manager using Terraform ☆15 Updated 8 months ago
- ☆14 Updated 9 months ago
- Using TypeScript and the AWS CDK, you can integrate Knowledge Bases into Amazon Bedrock to provide foundation models with contextual data… ☆12 Updated 6 months ago
- A collection of examples built with AWS DataOps Development Kit (DDK) ☆39 Updated 3 months ago
- ☆13 Updated last year
- Stream CDC into an Amazon S3 data lake in Apache Iceberg table format with AWS Glue Streaming and DMS ☆25 Updated 2 weeks ago
- ☆16 Updated 7 months ago