Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline
☆77 · Feb 15, 2023 · Updated 3 years ago
Alternatives and similar repositories for delta-architecture
Users that are interested in delta-architecture are comparing it to the libraries listed below.
- Docker compose and Google Colab demo to build a CDC with Delta Lake ☆15 · Sep 7, 2022 · Updated 3 years ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis ☆10 · Feb 2, 2024 · Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆32 · Updated this week
- Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, shor… ☆72 · Aug 27, 2025 · Updated 9 months ago
- Traditionally, engineers were needed to implement business logic via data pipelines before business users can start using it. Using this … ☆12 · May 22, 2026 · Updated 3 weeks ago
- This repo demonstrates how to use AWS application auto-scaling to implement custom-scaling in your Kinesis Data Analytics for Apache Flin… ☆19 · Feb 21, 2025 · Updated last year
- Scala API for Apache Spark SQL high-order functions ☆14 · Aug 4, 2023 · Updated 2 years ago
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 · Jun 7, 2021 · Updated 5 years ago
- ☆25 · Mar 15, 2024 · Updated 2 years ago
- Demo of using Airflow ☆11 · Jun 24, 2022 · Updated 3 years ago
- A custom end-to-end analytics platform for customer churn ☆10 · May 15, 2025 · Updated last year
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ☆30 · May 13, 2026 · Updated last month
- Spring Boot autoconfiguration for JMustache (web and nonweb template rendering) ☆14 · Jun 15, 2020 · Updated 6 years ago
- ☆64 · Nov 8, 2019 · Updated 6 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 · Jun 28, 2020 · Updated 5 years ago
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines ☆17 · Jan 21, 2020 · Updated 6 years ago
- Notebooks Python Machine Learning ☆20 · Feb 6, 2020 · Updated 6 years ago
- The Internals of Delta Lake ☆186 · May 10, 2026 · Updated last month
- Only if you take ping pong seriously. Seriously. ☆14 · Dec 4, 2022 · Updated 3 years ago
- ☆12 · Oct 24, 2025 · Updated 7 months ago
- Extensible streaming ingestion pipeline on top of Apache Spark ☆47 · Jul 17, 2025 · Updated 11 months ago
- Plot live-stats as graph from ApacheSpark application using Lightning-viz ☆18 · Jul 3, 2017 · Updated 8 years ago
- ☆36 · Aug 24, 2022 · Updated 3 years ago
- Learn Kubeflow with Arrikto ☆15 · Jan 4, 2022 · Updated 4 years ago
- End-to-end proof of concept showing core MLOps practices to develop, deploy and monitor a machine learning model for an employee retentio… ☆17 · May 28, 2024 · Updated 2 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆51 · Dec 2, 2023 · Updated 2 years ago
- Power a Spark Stream from anywhere in your Akka Stream Flow ☆12 · Mar 1, 2016 · Updated 10 years ago
- Crash course in Scala ☆22 · Apr 14, 2020 · Updated 6 years ago
- Spark SQL DBF Library ☆16 · Jan 2, 2015 · Updated 11 years ago
- ☆15 · Apr 13, 2026 · Updated 2 months ago
- Html Content / Article Extractor in Scala ☆18 · May 23, 2018 · Updated 8 years ago
- A sink to save Spark Structured Streaming DataFrame into Hive table ☆23 · May 7, 2018 · Updated 8 years ago
- Realistic sample value generators for Scala. ☆16 · Jul 4, 2024 · Updated last year
- How to implement a streaming at scale solution in Azure ☆233 · Oct 23, 2024 · Updated last year
- A minimal seed template for an Akka gRPC with Scala build ☆19 · Jun 4, 2026 · Updated 2 weeks ago
- Data Exploration Using Spark 2.0 ☆14 · Apr 17, 2018 · Updated 8 years ago
- Maelstrom is an open source Kafka integration with Spark that is designed to be developer friendly, high performance (millisecond stream … ☆22 · Feb 6, 2017 · Updated 9 years ago
- Give your AI assistant access to your Fitbit data for personalized health insights, trend analysis, and automated tracking. Works with Cl… ☆32 · May 27, 2025 · Updated last year
- TypeScript Maven Plugin ☆34 · Mar 12, 2016 · Updated 10 years ago