realtimedatalake / rtdl
rtdl makes it easy to build and maintain a real-time data lake
☆45 · Updated 2 years ago
Alternatives and similar repositories for rtdl
Users who are interested in rtdl are comparing it to the libraries listed below.
- Multi-hop declarative data pipelines ☆117 · Updated this week
- Data Streaming Framework to build data APIs, data lakes, and LLM tooling with SQL. ☆122 · Updated this week
- In-Memory Analytics for Kafka using DuckDB ☆129 · Updated last week
- A BYOC option for Snowflake workloads ☆81 · Updated this week
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆163 · Updated 7 months ago
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started! ☆60 · Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆29 · Updated last week
- Generated Kafka protocol implementations ☆33 · Updated 2 weeks ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆75 · Updated last week
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka. ☆64 · Updated last year
- 🌟 Examples of use cases that utilize Decodable, as well as demos for related open-source projects such as Apache Flink, Debezium, and Po… ☆79 · Updated 3 weeks ago
- Java implementation for performing operations on Apache Iceberg and Hive tables ☆19 · Updated 2 months ago
- Firebolt Core is a free, self-hosted edition of Firebolt's distributed query engine (https://www.firebolt.io/); it provides high-performa… ☆160 · Updated this week
- Hands-on workshop with Iceberg, Redpanda, Debezium and Kafka-Connect ☆13 · Updated 9 months ago
- Demos of Materialize, the operational data warehouse. ☆51 · Updated 4 months ago
- This repository contains a recipe for bootstrapping a climate analysis application using Apache Pinot and Superset ☆20 · Updated 4 years ago
- Cloud Storage Connector integrates Apache Pulsar with cloud storage. ☆28 · Updated last week
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ☆43 · Updated 2 weeks ago
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆92 · Updated 4 months ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆46 · Updated last year
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆61 · Updated 7 months ago
- Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL (with deep transformation of functions, data type… ☆55 · Updated last week
- Tektite DB ☆184 · Updated 4 months ago
- An open-source, community-driven REST catalog for Apache Iceberg! ☆28 · Updated last year
- Yet Another (Spark) ETL Framework ☆21 · Updated last year
- Java/Scala library for easily authoring Flyte tasks and workflows ☆44 · Updated 2 months ago
- Dashboard for operating Flink jobs and deployments. ☆37 · Updated 7 months ago
- ☆22 · Updated 4 months ago
- Use SQL to build ELT pipelines on a data lakehouse. ☆287 · Updated 3 years ago
- Helm Charts for RisingWave ☆20 · Updated last week