delta-io / kafka-delta-ingest
A highly efficient daemon for streaming data from Kafka into Delta Lake
☆370 · Updated last week
Related projects
Alternatives and complementary repositories for kafka-delta-ingest
- Lakekeeper: A Rust native Iceberg REST Catalog ☆235 · Updated this week
- Apache PyIceberg ☆476 · Updated this week
- Nessie: Transactional Catalog for Data Lakes with Git-like semantics ☆1,042 · Updated this week
- A native Delta implementation for integration with any query engine ☆146 · Updated this week
- Apache DataFusion Comet Spark Accelerator ☆823 · Updated this week
- ☆252 · Updated last month
- A library that provides useful extensions to Apache Spark and PySpark. ☆196 · Updated 2 weeks ago
- Delta Lake helper methods in PySpark ☆304 · Updated 2 months ago
- Open Control Plane for Tables in Data Lakehouse ☆312 · Updated this week
- A Python Library to support running data quality rules while the spark job is running⚡ ☆163 · Updated last week
- Schema modelling framework for decentralised domain-driven ownership of data. ☆248 · Updated 11 months ago
- An open protocol for secure data sharing ☆771 · Updated last week
- Replicates any database (CDC events) to Apache Iceberg (To Cloud Storage) ☆200 · Updated last week
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆406 · Updated 2 weeks ago
- Avro SerDe for Apache Spark structured APIs. ☆231 · Updated 4 months ago
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆217 · Updated this week
- A native Rust library for Apache Hudi, with bindings into Python ☆147 · Updated this week
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. ☆342 · Updated 5 months ago
- Performance Observability for Apache Spark ☆198 · Updated last week
- Apache Iceberg ☆670 · Updated this week
- Spark style guide ☆256 · Updated last month
- Pythonic Programming Framework to orchestrate jobs in Databricks Workflow ☆189 · Updated this week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆307 · Updated last year
- Adapter for dbt that executes dbt pipelines on Apache Flink ☆84 · Updated 8 months ago
- Snowflake Data Source for Apache Spark. ☆219 · Updated this week