varchar-io / nebula
A distributed block-based data storage and compute engine
☆154 Updated 4 months ago
Related projects
Alternatives and complementary repositories for nebula
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆304 Updated last year
- A portable Pythonic Data Catalog API powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture t… ☆162 Updated this week
- Serverless multi-protocol + multi-destination event collection system. ☆194 Updated last month
- Cylon is a fast, scalable, distributed-memory, parallel runtime with a Pandas-like DataFrame. ☆298 Updated 5 months ago
- Data pipelines from re-usable components ☆106 Updated last year
- Demos of Materialize, the operational data warehouse. ☆50 Updated 2 months ago
- The metrics layer for your data. Join us at https://metriql.com/slack ☆298 Updated last year
- Distributed SQL Query Engine in Python using Ray ☆238 Updated last month
- Open-source metadata collector based on ODD Specification ☆42 Updated last year
- BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper) ☆226 Updated 6 months ago
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started! ☆62 Updated 2 years ago
- Vectorized executor to speed up PostgreSQL ☆331 Updated 9 years ago
- 🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives. ☆65 Updated this week
- Data Catalog is a service for indexing parameterized, strongly-typed data artifacts across revisions. It also powers Flyte's memoization s… ☆54 Updated last year
- PostgreSQL extension providing approximate algorithms based on apache/datasketches-cpp ☆85 Updated 7 months ago
- The most valuable time series database in the universe ☆33 Updated 2 years ago
- Use SQL to build ELT pipelines on a data lakehouse. ☆285 Updated 2 years ago
- Data Tools Subjective List ☆80 Updated last year
- Ibis Substrait Compiler ☆95 Updated this week
- A Raft Library in C++ based on the Raft implementation in Apache Kudu ☆122 Updated this week
- ☆77 Updated last year
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆77 Updated this week
- Distributed SQL Engine in Python using Dask ☆397 Updated 2 months ago
- Apache Parquet Testing ☆46 Updated last week
- New file format for storage of large columnar datasets. ☆451 Updated this week
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆144 Updated this week
- ☆129 Updated last month
- Work with your web service, database, and streaming schemas in a single format. ☆330 Updated 7 months ago
- AnyBlob - A Universal Cloud Object Storage Download Manager Built For Cost-Throughput Optimal Analytics! ☆103 Updated last month