Delta Lake Documentation
☆53 Jun 19, 2024 Updated last year
Alternatives and similar repositories for delta-docs
Users who are interested in delta-docs are comparing it to the libraries listed below.
- Delta Lake Website ☆26 Apr 1, 2026 Updated last week
- Official Dockerfile for Delta Lake ☆61 Feb 24, 2026 Updated last month
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Jan 27, 2024 Updated 2 years ago
- Building a Data Pipeline with Astro Python SDK, dbt & Apache Airflow ☆10 Mar 20, 2024 Updated 2 years ago
- Delta lake and filesystem helper methods ☆50 Feb 29, 2024 Updated 2 years ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis ☆10 Feb 2, 2024 Updated 2 years ago
- Custom PySpark Connectors ☆95 Mar 3, 2026 Updated last month
- Model Context Protocol (MCP) server to interact with gRPC services using the grpcurl tool ☆16 Mar 5, 2025 Updated last year
- Delta Lake examples ☆239 Oct 8, 2024 Updated last year
- A highly efficient daemon for streaming data from Kafka into Delta Lake ☆430 May 5, 2025 Updated 11 months ago
- ☆15 May 31, 2023 Updated 2 years ago
- Manage Unity Catalog tables with Pydantic Models ☆10 Mar 5, 2025 Updated last year
- ☆13 Feb 19, 2025 Updated last year
- native Go library for Delta Lake ☆10 Jul 31, 2022 Updated 3 years ago
- This is a showcase repository for the multi-genie agent solution ☆24 Feb 22, 2026 Updated last month
- Delta Lake helper methods in PySpark ☆328 Jan 19, 2026 Updated 2 months ago
- Hackerrank, Coursera, other studies ☆13 Aug 19, 2021 Updated 4 years ago
- A Gentle introduction to Machine Learning with Apache Spark ☆11 Mar 2, 2026 Updated last month
- Arrow Flight SQL Server for DuckDB ☆143 Mar 31, 2026 Updated 2 weeks ago
- Sample scripts to use with Agentic Document Extraction (ADE). ☆40 Mar 26, 2026 Updated 2 weeks ago
- Visits sessionization pipeline used for the talk ☆13 May 28, 2024 Updated last year
- Example gaming leaderboard application covering streaming ingestion, CDC enrichment, processing and visualisation including demo of advan… ☆21 Nov 18, 2025 Updated 4 months ago
- Unity Catalog AI Model Context Protocol Server ☆16 Mar 28, 2025 Updated last year
- ☆18 Aug 6, 2024 Updated last year
- PySpark test helper methods with beautiful error messages ☆759 Updated this week
- A native Rust library for Delta Lake, with bindings into Python ☆3,184 Updated this week
- streaming eight subreddits from reddit api using kafka producer & spark structured streaming. ☆19 Apr 5, 2026 Updated last week
- A template for dockerized dbt-Core projects with VS Code Dev Containers. ☆21 Nov 14, 2022 Updated 3 years ago
- ☆30 Apr 28, 2021 Updated 4 years ago
- An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr… ☆8,746 Updated this week
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆16 Oct 14, 2019 Updated 6 years ago
- ScaleDP is an Open-Source extension of Apache Spark for Document Processing ☆18 Dec 2, 2025 Updated 4 months ago
- Distributed SQL Query Engine in Python using Ray ☆245 Oct 2, 2024 Updated last year
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆227 Mar 30, 2026 Updated 2 weeks ago
- This repository is all you need to understand how to build Gen AI products or AI agents ☆60 Updated this week
- A low-dependency HTTP health check server for Scala ☆13 Apr 7, 2026 Updated last week
- ☆61 Feb 1, 2025 Updated last year
- A complete real-time Change Data Capture (CDC) pipeline using Apache Flink, MariaDB, and Docker Compose. This project demonstrates how to… ☆34 Apr 6, 2026 Updated last week
- ☆12 Jul 22, 2025 Updated 8 months ago