Building a data lakehouse with open-source technology. Supports an end-to-end data pipeline, from source data on AWS S3 to the lakehouse, visualization, and a recommendation app.
☆39Dec 15, 2025Updated 3 months ago
Alternatives and similar repositories for building-lakehouse
Users that are interested in building-lakehouse are comparing it to the libraries listed below.
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data☆51Dec 2, 2023Updated 2 years ago
- Graduation project | Data Lakehouse☆40Feb 9, 2026Updated last month
- ☆67Sep 24, 2025Updated 6 months ago
- The Data Pipeline and Analytics Stack is a comprehensive solution designed for processing, storing, and visualizing data. Explore a compl…☆17Dec 26, 2023Updated 2 years ago
- Trino On K8S Via Helm & Metastore Workshop Querying Delta Tables☆12Jan 27, 2025Updated last year
- Local AWS - a lightweight AWS service emulator☆35Mar 22, 2026Updated last week
- A custom end-to-end analytics platform for customer churn☆11May 15, 2025Updated 10 months ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)☆66Sep 23, 2023Updated 2 years ago
- Reusable Python classes that extend open source PySpark capabilities. Examples of implementation are available under notebooks of repo htt…☆13Nov 1, 2024Updated last year
- 📡 Real-time data pipeline with Kafka, Flink, Iceberg, Trino, MinIO, and Superset. Ideal for learning data systems.☆62Jan 18, 2025Updated last year
- Helm Charts for RisingWave☆24Mar 16, 2026Updated 2 weeks ago
- npm package for ZetaSQL library☆16Sep 3, 2024Updated last year
- Data Mesh Pattern☆39Oct 18, 2023Updated 2 years ago
- ☆15Updated this week
- Files for the Docker and Kubernetes on Google Cloud Hands-On labs☆11Mar 14, 2023Updated 3 years ago
- ☆14Oct 18, 2020Updated 5 years ago
- To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a…☆45Mar 7, 2024Updated 2 years ago
- This is a demo project to compare two web scraping frameworks, Playwright and Selenium, using the new pipelining tool Dagster☆15Sep 9, 2021Updated 4 years ago
- ☆11Nov 26, 2024Updated last year
- A turnkey MLOps pipeline demonstrating how to go from raw events to real-time predictions at scale.☆243Oct 21, 2025Updated 5 months ago
- A Discord bot that integrates with Groq's LLM API to provide AI-powered question answering and assistance. Users can interact with differ…☆21May 18, 2025Updated 10 months ago
- ☆13Mar 30, 2024Updated last year
- ☆16Apr 1, 2025Updated 11 months ago
- SQLMesh example projects☆40Jul 2, 2025Updated 8 months ago
- GoogleSQL dialect format server using ZetaSQL☆22Dec 16, 2021Updated 4 years ago
- A Firebase Cloud Function and a Firebase hosted web app to treat weather data collected by Cloud IoT Core☆18Mar 10, 2019Updated 7 years ago
- A collection of examples built with AWS DataOps Development Kit (DDK)☆43Updated this week
- zetaSQL analyzer☆19Sep 11, 2020Updated 5 years ago
- ☆26Jun 29, 2023Updated 2 years ago
- Short Range Ultrasonic Radar - A simple radar using the ultrasonic sensor, this radar works by measuring a range from 3cm to 40 cm as non…☆19Nov 11, 2024Updated last year
- Spark-based pipeline to extract and parse monthly games from the Lichess database.☆21Sep 22, 2025Updated 6 months ago
- ☆10Apr 2, 2024Updated last year
- The Modern Data Stack in a (Smaller) Box☆12Jan 28, 2023Updated 3 years ago
- Hybrid Vector Search☆27Feb 2, 2026Updated last month
- Traditionally, engineers were needed to implement business logic via data pipelines before business users can start using it. Using this …☆12Updated this week
- minio as local storage and DynamoDB as catalog☆15May 14, 2024Updated last year
- Streaming Generative AI Application on AWS☆14Jun 24, 2024Updated last year
- Visualize linear programming at https://lpviz.net☆33Jan 20, 2026Updated 2 months ago
- ☆23Feb 18, 2026Updated last month