datazip-inc / olake
Fastest open-source tool for replicating Databases to Apache Iceberg or Data Lakehouse. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Starting with MongoDB
☆248 · Updated this week
Alternatives and similar repositories for olake:
Users who are interested in olake are comparing it to the libraries listed below.
- Open Control Plane for Tables in Data Lakehouse ☆328 · Updated this week
- Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. ☆683 · Updated this week
- The smallest DuckDB SQL orchestrator on Earth. ☆285 · Updated last month
- Minimalist alternative to dbt ☆237 · Updated this week
- DuckDB for streaming data ☆330 · Updated 2 weeks ago
- PyAirbyte brings the power of Airbyte to every Python developer. ☆252 · Updated last week
- The Open-Source Enterprise Data Platform in a single Portal ☆233 · Updated this week
- Turning PySpark Into a Universal DataFrame API ☆370 · Updated this week
- This is the main repository for SDF documentation found at docs.sdf.com, as well as public schemas, benchmarks, and examples ☆114 · Updated 3 weeks ago
- The bridge to effortless multi-engine data applications, currently supports Snowflake ❄️ and DuckDB 🦆 ☆166 · Updated this week
- A playground for running duckdb as a stateless query engine over a data lake ☆185 · Updated last year
- QuackPipe is an OLAP API built on top of DuckDB with ClickHouse compatibility bits ☆208 · Updated this week
- Use dbt to manage real-time data transformations in RisingWave. ☆22 · Updated 2 months ago
- Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database. ☆514 · Updated this week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆319 · Updated last year
- The metrics layer for your data. Join us at https://metriql.com/slack ☆305 · Updated last year
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆228 · Updated 2 months ago
- ☆218 · Updated this week
- DuckDB HTTP API Server and Query Interface in a Community Extension ☆157 · Updated 2 weeks ago
- Serverless multi-protocol + multi-destination event collection system. ☆202 · Updated 3 months ago
- A Rust based data/CSV/Parquet file generator ☆43 · Updated 3 months ago
- Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake… ☆634 · Updated this week
- Simple ClickHouse UI that relies on system tables to help monitor and provide overview of your cluster ☆137 · Updated this week
- Alto is a versatile data integration tool that allows you to easily run Singer plugins, build and cache PEX files encapsulating those plu… ☆59 · Updated last year
- Snowflake AI Toolkit is an AI Accelerator and Playground for enabling AI in Snowflake. It is a Plug and Play Streamlit-based Native App … ☆242 · Updated this week
- A Postgres Proxy Server in Python ☆269 · Updated 2 months ago
- Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL) ☆169 · Updated this week
- Multi-hop declarative data pipelines ☆111 · Updated this week
- Apache Hive Metastore as a Standalone server in Docker ☆68 · Updated 6 months ago