datazip-inc / olake
Fastest open-source tool for replicating Databases to Apache Iceberg or Data Lakehouse. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Starting with MongoDB
☆119 Updated this week
Alternatives and similar repositories for olake:
Users who are interested in olake are comparing it to the repositories listed below
- Open Control Plane for Tables in Data Lakehouse ☆321 Updated this week
- QuackPipe is an OLAP API built on top of DuckDB with ClickHouse compatibility bits ☆201 Updated this week
- 🌊 Continuously synchronize the systems where your data lives, to the systems where you _want_ it to live, with Estuary Flow. 🌊 ☆678 Updated this week
- The bridge to effortless multi-engine data applications, currently supports Snowflake ❄️ and DuckDB 🦆 ☆149 Updated this week
- The Open-Source Enterprise Data Platform in a single Portal ☆230 Updated this week
- ☆199 Updated this week
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆316 Updated last year
- DuckDB-powered analytics in Postgres ☆152 Updated 7 months ago
- 🏃‍♀️ Minimalist alternative to dbt ☆232 Updated this week
- DuckDB extension for Delta Lake ☆153 Updated this week
- Tektite DB ☆182 Updated 2 weeks ago
- Apache Hive Metastore as a Standalone server in Docker ☆68 Updated 5 months ago
- This is the main repository for SDF documentation found at docs.sdf.com, as well as public schemas, benchmarks, and examples ☆106 Updated 2 weeks ago
- An in-process Parquet merge engine for better data warehousing in S3 with MVCC ☆138 Updated this week
- Multi-hop declarative data pipelines ☆107 Updated this week
- Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database. ☆491 Updated this week
- PyAirbyte brings the power of Airbyte to every Python developer. ☆244 Updated this week
- FlockMTL: DuckDB extension to seamlessly combine analytics and semantic analysis using language models (LMs) ☆94 Updated last week
- Unified MySQL, Postgres & FlightSQL Server, Powered by DuckDB. ☆376 Updated last week
- A playground for running duckdb as a stateless query engine over a data lake ☆184 Updated last year
- Open, Multi-modal Catalog for Data & AI, written in Rust ☆76 Updated 4 months ago
- Serverless multi-protocol + multi-destination event collection system. ☆200 Updated 2 months ago
- Connectors for capturing data from external data sources ☆55 Updated this week
- Turning PySpark Into a Universal DataFrame API ☆354 Updated this week
- The metrics layer for your data. Join us at https://metriql.com/slack ☆303 Updated last year
- Easily sync your Postgres database to a Snowflake, ClickHouse, or DuckDB warehouse. ☆82 Updated 2 months ago
- MemQ is an efficient, scalable cloud native PubSub system ☆136 Updated 2 months ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆249 Updated last year
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆206 Updated this week
- Work with your web service, database, and streaming schemas in a single format. ☆337 Updated 10 months ago