realtimedatalake / rtdl
rtdl makes it easy to build and maintain a real-time data lake
☆43Updated 3 years ago
Alternatives and similar repositories for rtdl
Users who are interested in rtdl compare it to the libraries listed below
- Multi-hop declarative data pipelines☆122Updated last week
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format.☆165Updated last month
- Data Pipeline Automation Framework to build MCP servers, data APIs, and data lakes with SQL.☆135Updated this week
- In-Memory Analytics for Kafka using DuckDB☆142Updated this week
- Tektite DB☆184Updated 8 months ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆30Updated 2 weeks ago
- A BYOC option for Snowflake workloads☆101Updated last week
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating)☆63Updated this week
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started!☆60Updated 3 years ago
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka.☆65Updated 2 years ago
- Use SQL to build ELT pipelines on a data lakehouse.☆288Updated 3 years ago
- Work with your web service, database, and streaming schemas in a single format.☆343Updated last month
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs …☆159Updated 2 years ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, ....☆77Updated this week
- Apache Iceberg Spark S3 examples☆20Updated last year
- ☆107Updated 2 years ago
- A lightweight UI for Lakekeeper☆15Updated this week
- A dbt adapter for Decodable☆12Updated last month
- A Table format agnostic data sharing framework☆41Updated last year
- Generated Kafka protocol implementations☆34Updated 3 weeks ago
- Open Control Plane for Tables in Data Lakehouse☆370Updated this week
- Firebolt Core is a free, self-hosted edition of Firebolt's distributed query engine (https://www.firebolt.io/); it provides high-performa…☆177Updated last week
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis.☆106Updated 5 months ago
- MemQ is an efficient, scalable cloud native PubSub system☆138Updated this week
- Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL (with deep transformation of functions, data type …☆60Updated last week
- 🌟 Examples of use cases that utilize Decodable, as well as demos for related open-source projects such as Apache Flink, Debezium, and Po…☆84Updated 4 months ago
- Apache DataSketches☆101Updated 2 years ago
- Java binding to Apache DataFusion☆83Updated 6 months ago
- Dashboard for operating Flink jobs and deployments.☆41Updated last month
- Idempotent query executor☆53Updated 6 months ago