varchar-io / nebula
A distributed block-based data storage and compute engine
☆155 Updated this week
Alternatives and similar repositories for nebula:
Users that are interested in nebula are comparing it to the libraries listed below
- Data Catalog is a service for indexing parameterized, strongly-typed data artifacts across revisions. It also powers Flyte's memoization s… ☆54 Updated last year
- Use SQL to build ELT pipelines on a data lakehouse. ☆284 Updated 2 years ago
- The metrics layer for your data. Join us at https://metriql.com/slack ☆303 Updated last year
- Data pipelines from re-usable components ☆108 Updated last year
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆316 Updated last year
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. ☆298 Updated 7 months ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆62 Updated last month
- Beneath is a serverless real-time data platform ⚡️ ☆84 Updated 2 years ago
- In Memory Property Graph Server using a Shared Nothing design ☆43 Updated last year
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to … ☆175 Updated this week
- Open-source metadata collector based on ODD Specification ☆43 Updated last year
- Ibis Substrait Compiler ☆98 Updated this week
- Distributed SQL Query Engine in Python using Ray ☆242 Updated 3 months ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆158 Updated 2 years ago
- ☆104 Updated last year
- This is RonDB, a distribution of NDB Cluster developed and used by Hopsworks AB. It also contains development branches of RonDB. ☆595 Updated this week
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. ☆97 Updated this week
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆123 Updated 3 years ago
- Love your Data. Love the Environment. Love VULKИ. ☆43 Updated 4 years ago
- Serverless multi-protocol + multi-destination event collection system. ☆200 Updated 2 months ago
- BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper) ☆235 Updated 8 months ago
- Official repo for the Materialize + Redpanda + dbt Hack Day 2022, including a sample project to get everyone started! ☆62 Updated 2 years ago
- An open source, standard data file format for graph data storage and retrieval. ☆229 Updated last month
- chDB AWS Lambda container ☆15 Updated last year
- An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads. ☆424 Updated 3 years ago
- The most valuable time series database in the universe ☆33 Updated 2 years ago
- Distributed SQL Engine in Python using Dask ☆398 Updated 5 months ago
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or… ☆92 Updated 2 years ago
- Data Tools Subjective List ☆82 Updated last year
- ☆135 Updated 4 months ago