varchar-io / nebula
A distributed block-based data storage and compute engine
☆154Updated 2 months ago
Alternatives and similar repositories for nebula:
Users who are interested in nebula are comparing it to the libraries listed below.
- Cylon is a fast, scalable, distributed-memory parallel runtime with a Pandas-like DataFrame.☆301Updated 10 months ago
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg☆327Updated 2 years ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating)☆61Updated 4 months ago
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to …☆205Updated this week
- Data Catalog is a service for indexing parameterized, strongly-typed data artifacts across revisions. It also powers Flyte's memoization s…☆54Updated last year
- Serverless multi-protocol + multi-destination event collection system.☆202Updated 5 months ago
- The metrics layer for your data. Join us at https://metriql.com/slack☆306Updated 2 years ago
- Distributed SQL Query Engine in Python using Ray☆243Updated 6 months ago
- Data pipelines from re-usable components☆108Updated 2 years ago
- Core C++ Sketch Library☆231Updated 2 months ago
- BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper)☆241Updated 2 weeks ago
- Apache Iceberg C++☆63Updated this week
- ☆141Updated last week
- Use SQL to build ELT pipelines on a data lakehouse.☆286Updated 2 years ago
- Code repo for "An Empirical Evaluation of Columnar Storage Formats" VLDB Vol 17☆54Updated 11 months ago
- A runtime implementation of data-parallel actors.☆38Updated 3 years ago
- An example Flight SQL Server implementation - with DuckDB and SQLite back-ends.☆246Updated 7 months ago
- Hops Hadoop is a distribution of Apache Hadoop with distributed metadata.☆314Updated 3 months ago
- Distributed SQL Engine in Python using Dask☆402Updated 7 months ago
- ☆41Updated 3 years ago
- Apache datasketches☆95Updated 2 years ago
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or…☆92Updated 2 years ago
- In-memory, columnar, arrow-based database.☆46Updated 2 years ago
- Open-source metadata collector based on ODD Specification☆43Updated last year
- Ibis Substrait Compiler☆102Updated this week
- On-demand ClickHouse playground☆60Updated this week
- Venice, Derived Data Platform for Planet-Scale Workloads.☆534Updated this week
- Data Tools Subjective List☆83Updated last year
- New file format for storage of large columnar datasets.☆532Updated this week
- Analytical database for data-driven Web applications 🪶☆482Updated 2 months ago