Building a Data Lakehouse with open source technology. Supports an end-to-end data pipeline, from source data on AWS S3 to the Lakehouse, visualization, and a recommendation app.
☆40Dec 15, 2025Updated 4 months ago
Alternatives and similar repositories for building-lakehouse
Users who are interested in building-lakehouse are comparing it to the libraries listed below.
- Graduation project | Data Lakehouse☆41Feb 9, 2026Updated 2 months ago
- Trino On K8S Via Helm & Metastore Workshop Querying Delta Tables☆12Jan 27, 2025Updated last year
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…☆76Sep 2, 2023Updated 2 years ago
- A custom end-to-end analytics platform for customer churn☆11May 15, 2025Updated 11 months ago
- Data Pipeline that utilizes GCP, Python 3.10, Prefect, and more.☆10Jan 23, 2023Updated 3 years ago
- ☆11Nov 26, 2024Updated last year
- A turnkey MLOps pipeline demonstrating how to go from raw events to real-time predictions at scale.☆243Oct 21, 2025Updated 5 months ago
- ☆11Oct 19, 2023Updated 2 years ago
- Simple Syntax-Directed Interpreter for a subset of SQL☆13Jan 3, 2011Updated 15 years ago
- SQLMesh example projects☆40Jul 2, 2025Updated 9 months ago
- ☆10Nov 2, 2023Updated 2 years ago
- Trino monitoring with JMX metrics through Prometheus and Grafana☆17Aug 14, 2024Updated last year
- A collection of examples built with AWS DataOps Development Kit (DDK)☆43Mar 23, 2026Updated 3 weeks ago
- ICDE 2025 Paper, Grounding Natural Language to SQL Translation with Data-Based Self-Explanations☆17May 24, 2025Updated 10 months ago
- ZetaSQL analyzer☆19Sep 11, 2020Updated 5 years ago
- VSCode extension for working with Architecture As A Code in the C4 model. Includes syntax highlighting, diagram preview, and tools for wo…☆36Apr 7, 2026Updated last week
- ☆26Jun 29, 2023Updated 2 years ago
- Short Range Ultrasonic Radar - A simple radar using the ultrasonic sensor, this radar works by measuring a range from 3cm to 40 cm as non…☆19Nov 11, 2024Updated last year
- Spark-based pipeline to extract and parse monthly games from the Lichess database.☆21Sep 22, 2025Updated 6 months ago
- In this project I have built an ETL pipeline which scrapes the trending repositories by month, week and day LIVE extract other related info…☆12Sep 9, 2023Updated 2 years ago
- A Python CLI application that demonstrates how you can access AWS services, such as Amazon S3 and Amazon Athena, using trusted identity p…☆13Mar 11, 2025Updated last year
- The Modern Data Stack in a (Smaller) Box☆12Jan 28, 2023Updated 3 years ago
- ☆11Aug 20, 2024Updated last year
- Traditionally, engineers were needed to implement business logic via data pipelines before business users can start using it. Using this …☆12Apr 7, 2026Updated last week
- Streaming Generative AI Application on AWS☆14Jun 24, 2024Updated last year
- Automate data collection from Spotify's worldwide ranking in 50+ countries☆24May 3, 2020Updated 5 years ago
- A table-format-agnostic data sharing framework☆42Feb 4, 2024Updated 2 years ago
- This is a simple script that parses Python files in a directory and generates a mxfile containing a diagram of classes, attributes and m…☆11Feb 23, 2023Updated 3 years ago
- A Spark Connector that reads data from / writes data to Arrow-Flight end-points with Arrow-Flight and Flight-SQL☆48Dec 14, 2025Updated 4 months ago
- Code for the paper: Kernel Distributionally Robust Optimization☆13Feb 21, 2021Updated 5 years ago
- ICDE 2023 Paper, GAR: A Generate-and-Rank Approach for Natural Language to SQL Translation☆19Sep 19, 2023Updated 2 years ago
- Robust Bond Portfolio Construction via Convex-Concave Saddle Point Optimization☆13May 13, 2024Updated last year
- ☆16Feb 11, 2026Updated 2 months ago
- Serverless Multi-Tenant Application on AWS Amplify☆17Jan 11, 2024Updated 2 years ago
- Real-time OLTP system for credit card fraud detection using AWS API Gateway, Kinesis, and RDS PostgreSQL. Features a scalable, serverless…☆24Dec 16, 2024Updated last year
- Sample application showcasing the use of Dapr to build microservices based apps☆15Feb 4, 2026Updated 2 months ago
- Sample code and documentation for very basic things that I can't remember but want to aggregate in one place☆13Nov 7, 2021Updated 4 years ago
- Ecommerce Realtime Data Pipeline (Data Modeling, Workflow Orchestration, Change Data Capture, Analytical Database and Dashboarding)☆66Mar 9, 2024Updated 2 years ago
- Python packages for Support Vector Regression with Linear Constraints☆10Jul 9, 2020Updated 5 years ago