realtimedatalake / rtdl
rtdl makes it easy to build and maintain a real-time data lake
☆43 · Updated 3 years ago
Alternatives and similar repositories for rtdl
Users who are interested in rtdl are comparing it to the libraries listed below.
- Multi-hop declarative data pipelines ☆122 · Updated this week
- Data Pipeline Automation Framework to build MCP servers, data APIs, and data lakes with SQL. ☆204 · Updated this week
- In-Memory Analytics for Kafka using DuckDB ☆146 · Updated last month
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆168 · Updated 3 months ago
- Support for generating modern platforms dynamically with services such as Kafka, Spark, Streamsets, HDFS, .... ☆77 · Updated last week
- Use SQL to build ELT pipelines on a data lakehouse. ☆288 · Updated 3 years ago
- Rewrite BigQuery, Redshift, Snowflake and Databricks queries into DuckDB compatible SQL (with deep transformation of functions, data type… ☆64 · Updated this week
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆159 · Updated 3 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆30 · Updated last week
- Tektite DB ☆184 · Updated 9 months ago
- Serverless multi-protocol + multi-destination event collection system. ☆209 · Updated last year
- Demonstration of a Hive Input Format for Iceberg ☆26 · Updated 4 years ago
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or… ☆94 · Updated 3 years ago
- Work with your web service, database, and streaming schemas in a single format. ☆348 · Updated 3 months ago
- A home for LinkedIn's changes to Apache Iceberg ☆63 · Updated last week
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. ☆108 · Updated 7 months ago
- Apache iceberg Spark s3 examples ☆20 · Updated last year
- sql-logic-test ☆68 · Updated 2 years ago
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started! ☆61 · Updated 3 years ago
- Apache datasketches ☆102 · Updated last week
- ☆34 · Updated last month
- Examples for using Apache Flink® with DataStream API, Table API, Flink SQL and connectors such as MySQL, JDBC, CDC, Kafka. ☆65 · Updated 2 years ago
- Open Control Plane for Tables in Data Lakehouse ☆375 · Updated this week
- Firebolt Core is a free, self-hosted edition of Firebolt's distributed query engine (https://www.firebolt.io/); it provides high-performa… ☆186 · Updated this week
- Idempotent query executor ☆53 · Updated 7 months ago
- ☆107 · Updated 2 years ago
- Generated Kafka protocol implementations ☆34 · Updated 2 weeks ago
- Experimental version. A BYOC option for Snowflake workloads ☆102 · Updated this week
- Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary! ☆235 · Updated 10 months ago
- Sample code to accompany blog post showcasing Arrow Flight SQL running on DuckDB ☆36 · Updated 2 years ago